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I.  Introduction 


This  final  report  summarizes  the  work  performed  under  the  AFOSR  grant 
number  AFOSR-89-0072,  originally  entitled  "Equilibria,  Lattices,  and  Chaotic 
Dynamics  of  Point  Vortices".  The  aim  of  the  proposed  research  effort  involved 
theoretical  work,  both  analytic  and  numerical,  on  a  number  of  different 
problems  which  were  all  loosely  tied  together  as  involving  some  aspect  of  vortex 
systems,  and  their  relation  to  chaos  in  fluid  flows.  Significant  results  were 
obtained  during  this  funding  period  in  several  major  topics.  The  first  topic 
which  was  investigated  was  a  continuation  of  the  authors  previous  work  on 
vortex  lattices.  ResuUs  consisted  of  the  refinement  of  the  analytic  expression  for 
the  lattice  summation  of  an  infinite  lattice  of  point  vortices,  and  use  of  this 
expression  to  calculate  the  allowed  lattice  structures  of  the  two-component 
triangular  lattice.  It  was  also  shown  how  these  expressions  can  be  used  to 
calculate  the  bulk  physical  properties  of  vortex  lattices,  by  calculating  the  energy 
of  slip  displacement  for  the  triangular  lattice. 

A  second  major  topic  which  was  researched  during  this  period  was  the 
investigation  of  chaotic  motion  in  fluid  flows  due  to  vortex  dynamics.  One 
aspect  which  was  investigated  was  to  demonstrate  the  presence  of  low 
dimensional  chaos  in  an  actual  experimental  open  flow.  This  work  used 
experimental  data  from  the  erratic  fluid  flow  downstream  of  a  cylinder,  obtained 
from  INLS's  fluids  lab,  and  utilized  a  new  technique  developed  by  the  author  for 
measuring  the  dimension  of  a  chaotic  system  from  a  time  series.  The  motion  was 
found  to  have  an  apparent  dimension  of  four,  which  agrees  with  an  alternative 
model  of  the  system.  Another  problem  involving  chaos  due  to  vortices  was  an 
analytic  and  numerical  investigation  of  the  motion  resulting  from  the  interaction 
of  a  vortex  in  an  open  flow  and  a  stationary  bluff  body,  with  small  sinusoidal 
perturbation.  It  was  found  that,  for  the  proper  parameter  regimes,  the  passing 
vortex  could  actually  be  'chaotically  trapped'  around  the  body  for  significant 
periods  of  time.  This  behaviour  is  a  relatively  new  phenomenon  and  one  of 
possibly  large  significance. 

Finally,  motivated  by  an  interest  in  measuring  and  describing  low¬ 
dimensional  chaos  in  vortex  and  fluid  flows  in  general,  a  significant  amount 
results  were  obtained  in  the  general  theory  of  nonlinear  analysis.  An  extensive 
method  was  developed  for  prediction  of  chaotic  motion  based  on  a  global, 
functional  description  of  the  attractor,  and  utilizing  only  a  time  series  of  data  as 


input.  This  method  was  found  to  have  superior  predictive  ability  over  most 
commonly  used  methods.  Another  significant  result  was  the  development  of  a 
new  method  for  the  determination  of  the  minimum  embedding  dimension 
necessary  to  reconstruct  the  attractor  for  a  system.  This  method  is  an  entirely 
new  approach  based  on  information  theory,  and  offers  an  alternative  technique  to 
the  ubiquitous  Grassberger-Procaccia  algorithm  (its  initial  uses  were  to  analyze 
the  experimental  data  mentioned  above).  In  addition  to  these  two  major 
projects,  results  were  also  obtained  in  work  done  on  the  investigation  of  a  new 
type  of  dynamical  mapping,  which  has  interesting  property  of  being  locally 
conservative  but  globally  dissipative.  Lastly,  work  was  begun  and  is  still 
ongoing  in  developing  a  new  method  for  calculating  the  lyapunov  exponents  of  a 
chaotic  system  from  a  time  series  of  data,  in  the  presence  of  additive  guassian 
noise. 

In  all,  work  performed  during  the  funding  period  has  resulted  in  three 
published  papers,  two  papers  currently  in  review,  and  two  papers  in  preparation 
(all  of  these  are  included  as  appendices).  Versions  of  the  pre-prints  and 
published  papers  will  be  supplied  as  they  become  available. 


II.  Vortex  Lattices 


This  section  describes  analytic  and  numerical  work  done  in  the  calculation 
of  the  lattice  summations  for  the  energy  of  a  two-dimensional,  infinite  lattice  of 
point  vortices.  This  work  was  done  in  collaboration  with  L.J.Campbell  and 
M.M.Doria  of  Los  Alamos  National  Laboratory,  and  Physical  Review 
publication  resulting  from  this  work  appears  as  Appendix  A. 

The  calculation  of  properties  (such  as  the  energy  density)  for  infinite 
lattices  with  Coulomb  interactions  has  a  long  history.  In  particular,  the  two- 
dimensional  case  which  arises  from  lattices  of  point  vortices  is  an  example  of  this 
problem,  as  both  satisfy  a  logarithmic  potential,  and  with  the  only  difference 
being  that  of  the  resulting  dynamics.  Vortex  lattices  are  of  importance  both 
because  of  their  mathematical  properties,  as  well  as  being  reasonable  models  for 
superfluid  helium  systems,  systems  of  line  charges  or  mrrents,  screw 
dislocations  in  crystals,  vortex-like  interactions  in  the  quantum  hall  effect,  and 
most  recently  as  a  mechanism  for  'high  Tc'  superconductors.  Until  the  present 
work,  no  closed  form  expression  existed  for  these  lattice  summations,  although 
there  had  been  several  attempts  to  derive  them(see  references  in  Appendix  A). 
Numerical  simulations  thus  consisted  only  of  clever  ways  to  do  the  summations 
explicitly. 

The  principle  result  of  the  work  discussed  here  is  the  development  of  an 
expression  for  the  energy  density  of  a  lattice  of  point  vortices  in  terms  of  a  very 
rapidly  convergent  product  expansion.  Although  a  more  primitive  form  of  the 
summation  was  developed  prior  to  this  funding  period,  these  results  represent  a 
considerable  improvement  in  the  formalism  involved,  as  well  a  a  more  sound 
understanding  of  the  physical  interpretation.  In  particular,  the  correct  form  of 
the  normalization  was  finally  understood,  as  well  as  a  more  general 
understanding  of  the  form  and  interpretation  of  the  artificial  neutralizing 
background  one  must  add  to  cancel  the  mathematical  singularities  involved  in  the 
infinite  summation  (see  the  introduction  in  Appendix  A).  The  final  results  of 
this  reformulation  is  summarized  in  Eq.21  of  Appendix  A;  th;s  equation  gives 
the  energy  density  of  of  a  lattice  of  vortices  with  given  species  of  vortices  of 
given  strengths.  This  equation  allows  for  any  arbitrarily  shaped,  four-sided  unit 
cell  and  also  for  arbitrary  numbers  and  strengths  of  vortices.  The  correct 
normalization  now  allows  for  the  correct  comparison  between  different  lattices 
with  similar  number  densities  for  different  species.  Since  derivatives  and  other 


operations  can  easily  be  performed  on  the  expression,  it  is  also  suitable  for  the 
calculation  of  bulk  properties  of  the  lattices  and  even  to  investigate  lattice 
dynamics.  In  its  present  form  this  expression  is  directly  applicable  to  a  wide 
variety  of  problems  associated  with  this  type  of  logarithmic-potential  lattice,  such 
as  those  mentioned  above. 

In  addition  to  the  above  results,  this  formulation  was  then  used  to  obtain 
several  new  results  for  vortex  lattices.  Previously,  only  properties  of  the  single¬ 
species,  single-vortex  square  and  triangular  lattices  could  be  calculated.  Section 
IV  of  Appendix  A  presents  the  lattice  shapes  allowable  for  several  new  varieties 
of  vortex  lattices.  These  shapes  are  all  calculated  by  similar  means:  minima  of 
the  lattice  energy  density  are  found  by  sweeping  through  the  lattice  parameters, 
and  actual  lattice  configurations  are  assumed  to  exist  for  these  minima.  Section 
IV  shows  several  new  lattice  structures,  and  gives  some  of  the  lattice  energies, 
for  which  the  energy  of  the  square  and  triangular  lattices  is  verified.  It  should 
be  noted  that  many  other  new  configuradons  of  various  types  of  lattices  have 
been  generated,  and  that  only  the  two  dimensional  case  has  been  presented  in  the 
paper.  Other  configurations  may  eventually  be  published  elsewhere. 

Finally,  some  results  were  obtained  on  the  calculation  of  some  bulk  lattice 
properties  using  the  energy  density  formulation.  The  property  of  significant 
physical  interest  for  the  lattices  of  this  particular  type  was  the  energy  of  slip 
displacement.  This  energy  in  effect  measures  the  'rigidness'  of  the  lattice  to 
deformation  along  one  of  its  principal  directions.  Figure  3  of  Appendix  A 
presents  a  summary  of  numerical  calculations  done  to  measure  the  energy  of  slip 
displacement  for  a  triangular  vortex  lattice.  Calculations  of  this  type  were 
entirely  impossible  previously.  This  calculation  is  also  primarily  useful  as  an 
example  of  the  practicality  of  Eq,21  for  the  calculation  of  a  wide  variety  of 
properties  associated  with  logarithmic-potential  lattices. 

It  should  be  noted  that  although  the  author  has  not  pursued  work  on  this 
project  much  beyond  what  is  presented  here,  L.J.Campbell  has  demonstrated  the 
importance  of  these  results  by  continuing  to  apply  this  formalism  to  a  number  of 
outstanding  problems,  in  particular  some  aspects  relating  to  models  of  high  Tc 
superconductors. 


III.  Vortex  Chaos 


A.  'Chaotic  Trapping'  in  Open  Flows 

This  section  describes  work  in  progress  which  has  been  ongoing  for 
approximately  one  year,  and  which  has  been  done  in  collaboration  with 
E.A.Novikov  of  INLS,  UCSD.  The  original  analytic  foundation  for  the  work 
appears  in  a  short  pre-print  by  Novikov,  and  in  this  continuation  of  the  work  this 
author  has  completed  an  extensive  numerical  study  which  will  shortly  appear  in 
pre-print  form,  and  which  contains  new  and  significant  results  concerning  the 
interaction  of  vortices  with  bluff  bodies  in  open  flows.  Examples  of  the 
numerical  results  appear  in  Appendix  B,  as  well  as  Novikov's  original  pre-print 
to  include  background  discussion  (some  of  these  numerical  results  were  also 
recently  presented  in  a  talk  at  SIAM'S  Dynamical  Systems  conference  in  Orlando, 
May  1990 ). 

The  underlying  model  for  this  investigation  is  that  of  a  bluff  body  (in  this 
case  a  two-dimensional  cylinder)  in  a  uniform  two-dimensional  flow,  with  a 
single  vortex  passing  by  and  interacting  with  the  cylinder.  To  this  system  is 
added  a  small  perturbation  which  consists  of  sinusoidal  vibration  of  the  body 
along  the  flow  direction.  This  system  is  a  simple  and  general  model  for  a 
vortex-like  structure  interacting  with  a  body  in  an  open  flow,  for  which  there 
are  many  physical  analogous;  the  most  obvious  of  these  is  that  of  a  tornado 
interacting  with  structures  such  as  large  buildings.  In  the  case  of  zero 
perturbation,  the  vortex  trajectories,  and  hence  the  topology  of  the  flow  field, 
can  be  solved  exactly  (see  Appendix  B).  However,  for  finite  perturbation  and 
the  proper  parameter  regimes,  Novikov  has  shown  analytically  that  one  can 
generically  expect  chaotic  motion  of  the  vortex  to  emerge.  This  chaotic  motion 
results  in  several  consequences,  the  first  and  most  obvious  of  which  is  that  the 
resulting  vortex  trajectory  is  unpredictable  and  extremely  sensitive  to  initial 
conditions.  Hence  the  direction  of  scatter  of  the  impinging  vortex  is  highly 
erratic  with  respect  to  initila  position.  A  consequence  of  this  is  another 
important  phenomenon,  namely  that  of  'chaotic  trapping'  (a  related  phenomenon 
has  been  introduced  previously  in  gravitational  interactions  by  M.Henon).  This 
phenomenon  consists  of  vortex  motion  for  which  the  vortex  approaching  from 
infinity  can  become  trapped  in  rotational  motion  around  the  body  for  a  (finite) 
period  of  time,  and  then  escape  again.  The  trapping  is  a  result  of  the  vortex 


being  caught  in  the  stochastic  layers  around  the  flow  separatrix,  which  is 
generated  by  the  perturbation.  Because  this  resulting  motion  is  chaotic,  the 
rotational  motion  of  the  trapped  vortex  can  in  turn  result  in  large  pressure 
differences  on  the  boundary  of  the  body.  This  phenomenon  is  apparently  new, 
and  because  the  model  predicts  that  this  is  the  generic  state  for  finite  perturbation 
of  such  systems,  this  phenomenon  should  be  recognizable  in  physical  systems  as 
well.  Speculation  is  that  this  behaviour  may  be  related  to  the  destructive  ability 
of  tornadoes,  as  well  as  that  of  the  highly  dangerous  downdrafts  which 
sometimes  occur  around  airports.  In  addition  to  these  physical  applications,  the 
system  is  also  of  considerable  interest  as  perhaps  the  simplest  model  of  an  open 
flow  system  which  exhibits  such  rich  chaotic  behaviour. 

The  numerical  investigations  of  the  above  system  have  produced  a  large 
body  of  results  which  indicate  an  even  richer  behaviour  than  was  suggested  by 
the  analytic  analysis.  The  first  major  result  was  the  study  of  the  variation  of  the 
topology  of  the  flow  field,  with  zero  perturbation,  as  the  dimensionless 
parameter  a  of  the  dynamical  equation  is  changed  (see  paragraph  5  of  the 
Novikov  preprint).  Roughly,  this  parameter  measures  the  ratio  of  the  vortex  and 
flow  field  strengths.  To  study  the  variation  of  topology,  the  positions  of  the 
stable  and  unstable  stagnation  points  of  the  flow  were  found  for  the  entire  range 
of  a,  using  computer  algebra  and  solving  numerically  for  the  roots.  Once 
distinct  regimes  for  the  root  positions  were  identified,  the  flow  fields  were 
mapped  out  by  integrating  the  dynamical  equations.  Using  this  method,  twelve 
distinct  topologies  were  identified  for  the  unperturbed  flow,  far  richer  than  was 
first  suspected.  These  topologies  are  outlined  and  included  in  Appendix  B.  The 
majority  of  these  topologies  are  capable  of  exhibiting  stochastic  regions,  and 
hence  trapping  phenomena.  In  addition,  Case  3  of  these  topologies  indicates  a 
stagnation  point  whose  character  is  a  mixture  of  hyperbolic  and  elliptic,  which 
may  itself  be  a  new  type  of  structure  in  Hamiltonian  flows. 

The  second  important  result  of  the  numerical  investigations  was  the 
verification  of  the  existence  of  the  chaotic-trapping  phenomenon.  Since  the  size 
of  the  stochastic  region,  and  hence  the  probability  of  trapping,  is  moderately 
dependent  upon  the  frequency  of  the  perturbation,  the  Melnikov  integral  for  the 
system  was  computed  numerically  to  determine  the  frequency  range 
corresponding  to  large  stochastic  regions.  Using  different  values  within  this 
parameter  regime,  a  large  number  of  trajectories  were  found,  by  numerical 
integration,  which  exhibited  trapping  behaviour.  Two  typical  examples  are 


shown  in  Appendix  B.  Trajectories  were  found  which  became  trapped, 
performed  as  many  as  twenty  erratic  revolutions  around  the  body,  and  then 
escaped.  Within  the  proper  parameter  regimes,  the  measure  of  initial  conditions 
resulting  in  trapped  trajectories  seems  to  be  quite  significant,  as  these  trajectories 
were  relatively  easily  found. 

Related  to  the  above  phenomenon,  a  third  interesting  result  was  obtained 
for  this  system.  For  the  case  where  a  stable  and  unstable  stagnation  point  lie 
somewhat  near  the  boundary  of  the  body  and  on  the  same  side,  trajectories  were 
found  where  the  vortex  could  actually  first  become  trapped  around  the  body, 
then  around  the  elliptic  point,  and  then  often  switch  back  and  forth  several  times. 
These  'switching'  trajectories  have  been  observed  for  several  initial  conditions, 
and  seem  to  be  a  somewhat  unusual  and  counter-intuitive  result.  An  example  of 
such  a  switching  trajectory  is  also  shown  in  Appendix  B. 

Finally,  for  a  few  cases  of  different  parameter  values,  Poincare  sections  of 
the  vortex  motion  for  specific  trapping  trajectories  were  taken.  This  was  done 
solely  to  aid  in  gaining  intuition  about  the  structure  and  appearence  of  the 
stochastic  layers  which  cause  the  trapping  phenomenon  itself.  An  example  of  one 
of  these  sections  is  included  in  Appendix  B.  Although  not  of  direct  relevance  to 
the  analysis  discussed  above,  this  type  of  chaotic  analysis  will  be  the  subject  of  a 
future  more  detailed  investigation. 

Future  work  on  this  project  is  planned  to  be  quite  extensive.  Several 
numerical  experiments  are  planned  to  characterize  the  stochastic  nature  of  the 
system,  including  determining  the  measure  of  initial  conditions  which  result  in 
trapping  trajectories  (it  is  suspected  that  this  may  result  in  a  'devil's  staircase' ), 
measuring  time  series  of  boundary  pressure  on  the  body  for  trapping 
trajectories,  and  further  Poincare  analysis.  Beyond  that,  the  system  will  be 
generalized  by  considering  vortex-dipole  interactions  with  the  body,  more 
general  boundary  geometries,  and  alterations  of  the  model  to  make  it  more 
applicable  to  specific  physical  systems. 


B.  Low-Dimensional  Chaos  in  an  Open  Flow  Experiment 


In  the  original  proposal  for  this  work,  it  was  mentioned  that  the  possibility 
existed  of  developing  new  results  for  the  generalized  von  Karman  street  as  a 
better  model  of  the  wake  of  an  open  flow  past  a  bluff  body.  Although  new 
analytic  results  could  not  be  obtained,  an  investigation  was  done  of  some 
experimental  data  of  such  a  flow,  which  seemed  to  exhibit  a  chaotic  nature.  The 
principle  tool  for  the  analysis  was  a  new  technique  for  determining  the  minimum 
embedding  dimension  of  a  chaotic  signal,  developed  in  part  by  the  author  and 
which  is  separately  presented  in  this  document  in  Section  IV.  Using  this  method, 
the  region  of  the  flow  investigated  was  found  to  be  chaotic  with  a  dimensionality 
of  four,  which  also  agrees  with  a  first  order  model  proposed  for  the  system.  A 
summary  of  these  results  are  given  on  page  eight  of  the  pre-print  'Information 
Theoretic  Methods  for  Determining  Minimum  Embedding  Dimensions  of 
Strange  Attractors',  which  is  included  in  Appendix  E.  As  with  the  previous 
section,  this  work  is  still  in  progress,  and  a  more  extensive  and  cooperative 
effort  is  planned  to  identify  low-dimensional  chaos  in  various  aspects  of  these 
flows. 

The  experiment  from  which  the  data  was  obtained  was  performed  by 
M.Gharib  and  K.Lewis  at  the  DARPA/URI  fluid  dynamics  laboratory  at  INLS, 
UCSD.  A  detailed  description  of  the  experiment  will  appear  as  a  pre-print  in  the 
near  future.  Briefly,  two  thin  rotating  cylinders  of  slightly  different  radius  are 
placed  end-to-end  and  perpendicular  to  an  otherwise  uniform  flow  field.  Both 
cylinders  generate  vortices  in  their  downstream  wake,  however  the  mismatch  in 
radii  causes  an  unstable  interaction  which  results  in  low-dimensional  chaos  in  the 
flow  near  the  boundary  of  the  two  regions.  A  photograph  of  the  flow,  supplied 
by  K.Lewis,  for  a  typical  experimental  run  is  shown  in  Appendix  C,  as  are  a 
time  series  and  FFT  for  the  chaotic  region. 

The  MDL  technique  for  measuring  minimum  embedding  dimension  was 
used  on  a  set  of  several  different  time  series  from  these  experiments,  in  an 
attempt  to  determine  whether  this  flow  was  indeed  low-dimensional  chaos  (this 
method  is  described  more  fully  in  Section  IV).  It  should  be  noted  here  that  the 
conventional  method  for  determining  embedding  dimension,  ie.  the  Grassberger- 
Procaccia  algorithm,  yielded  inconsistent  results  for  this  particular  system,  and 
this  was  one  of  the  primary  motivations  for  the  use  of  the  new  MDL  algorithm 


to  analyze  this  data.  After  considerable  analysis,  it  was  shown  that  this  flow 
typically  seemed  to  have  a  dimensionality  of  four,  although  the  dimension  could 
be  as  high  as  six  for  some  parameter  regimes.  A  plot  of  the  MDL  function  for  a 
typical  data  series  is  also  shown  in  Appendix  C,  showing  the  minimum  of  the 
MDL  function  occuring  at  a  dimension  of  four.  Since  Gharib  and  Lewis  have 
proposed  a  model  for  the  chaotic  region  based  on  coupled  duffing  oscillators, 
which  should  also  have  an  expected  diemnsionality  of  four,  these  results  seemed 
to  confirm  this  conclusion. 

Although  work  on  this  project  is  still  ongoing  and  results  are  somewhat 
preliminary,  this  investigation  could  prove  very  significant  as  an  excellent 
example  of  low-dimensional  chaos  in  a  flow  which  can  be  demonstrated 
experimentally,  analytically,  and  through  nonlinear  time  series  analysis.  Future 
work  is  also  planned,  in  cooperation  with  researchers  in  the  DARPA  fluids  lab, 
which  will  involve  a  similar  analysis  in  an  attempt  to  look  for  low-dimensional 
chaos  in  the  velocity  and  acoustic  fields  of  a  submerged  jet. 


IV.  Measurement  and  Prediction  in  Chaotic  Systems 

Because  of  the  authors  interest  in  investigating  chaos  in  vortex  flows  and 
in  fluid  flows  in  general,  there  was  considerable  motivation  during  this  project  to 
study  the  various  methods  for  measuring,  characterizing,  and  predicting  chaotic 
behaviour  in  flows,  in  particular  those  resulting  in  time  series  from  actual 
physical  systems.  It  was  generally  found  that,  although  idealized  methods  for 
analyzing  chaotic  motions  were  well  established,  application  of  these  techniques 
to  real  physical  data,  which  often  include  significant  levels  of  noise,  was  often 
very  poor.  A  significant  amount  of  effort  during  this  funding  period  was 
therefore  devoted  tc  investigating  and  developing  improvements  in  these 
techniques,  which  would  be  more  robust  to  the  problems  associated  with  real 
data  (ie.  noise,  short  data  sets,  irregular  sampling  of  data,  etc. ).  These  new 
techniques  were  investigated  with  an  eye  towards  increasing  the  analytic  abilities 
for  experimental  data,  especially  that  being  generated  at  the  DARPA  UCSD 
fluids  laboratory.  This  work  during  the  past  1  1/2  years  has  resulted  in  two  new 
and  powerful  methods  for  the  analysis  of  chaotic  data  from  actual  physical 
systems,  and  has  produced  two  published  papers  and  one  paper  currently  in 
review.  These  two  methods  are  outlined  below. 

A.  Global  Prediction  for  Dissipative,  Chaotic  Systems 

Most  methods  of  analysis  for  chaotic  systems  involve  the 
measurement  of  several  physical  properties  which  are  known  to  indicate  chaotic 
motion,  such  as  the  dimension,  the  lyapunov  spectrum,  etc.  Once  these  quantities 
are  determined,  the  obvious  next  step  should  be  to  use  these  quantifiers  for 
model  selection,  signal  processing,  prediction,  and  the  like.  Surprisingly,  only  a 
handful  of  papers  exist  which  attempt  to  develop  any  of  these  applications,  and 
very  few  serious  efforts  to  analyze  actual  data  have  been  done.  The  work 
described  below,  done  in  collaboration  with  H.D.I.Abarbanel  and  R.Brown  of 
INLS,  UCSD,  was  an  attempt  to  develop  a  general  method  for  modelling  a 
chaotic  systems'  flow  in  phase  space  by  simultaneously  utilizing  as  much 
dynamical  information  as  possible,  in  the  most  efficient  way,  and  then  utilizing 
this  information  for  prediction.  Prediction  in  this  sense  means  the  generalized 
forward  extrapolation  of  short  segments  of  phase  space  trajectories,  which  can 
then  be  used  for  actual  prediction  or  for  signal  processing,  noise  reduction,  etc. 
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The  results  of  this  investigation  grew  into  a  somewhat  extensive  general 
methodology,  which  is  described  in  the  two  papers  included  here  as  Appendix  D. 

To  give  a  brief  overview  of  the  method,  the  basic  idea  starts  by 
reconstructing  a  chaotic  systems'  attractor  in  phase  space  from  a  time  series 
using  time-delay  embedding  and  the  standard  methods  to  determine  the 
embedding  dimension  and  the  autocorrelation  time.  The  general  procedure  is  to 
model  the  local  flow  on  the  attractor  by  a  mapping  function  which  uses  the 
information  of  where  nearby  neighboring  points  on  the  attractor  are  mapped  to. 
This  function  can  then  be  used  to  predict  where  a  new  point  on  the  attractor  will 
evolve.  In  the  formulation  and  the  numerical  algorithms  which  were  developed, 
a  general  form  of  the  mapping  function  is  used  with  the  important  properties  that 
it  is  global,  in  the  sense  that  only  one  function  is  used  over  the  entire  attractor, 
and  secondly  that  the  function  is  well  defined  analytically,  so  that  for  example 
gradients  can  be  computed  and  local  derivatives  can  be  used  for  parameter 
fitting.  Specific  information  about  the  given  dynamical  system  is  then  built  into 
the  mapping  function  in  the  following  way:  the  mapping  function  contains 
parameters  which  weight  the  way  local  information  on  the  attractor  is  used,  such 
as  the  number  of  neighbors  to  include,  the  number  of  previous  iterates  to 
include,  the  length  scale  involved,  etc.  The  correct  parameters  that  globally 
describe  the  particular  attractor  are  then  found  by  nonlinear  least-squares  fitting 
of  the  function  to  the  data  set  taken  from  the  dynamical  system.  The  correct 
parameters  are  then  chosen  as  those  which  optimally  reproduce  the  given  data  set 
from  the  system. 

Perhaps  the  most  important  aspect  of  this  method  is  the  inclusion  of 
additional  dynamical  information  about  the  system  in  the  mapping  function.  In 
addition  to  the  dimension,  there  also  exist  other  important  quantifiers  of  the 
character  of  the  chaotic  motion  which  can  be  extracted  from  the  time  series  data. 


These  include  the  lyapunov  spectrum  of  exponents,  and  the  probability  density  of 
orbits  on  the  attractor.  These  quantities  are  measured  from  the  time  series  using 
standard  algorithms,  and  are  then  also  built  into  the  mapping  function  in  the 
following  manner:  the  functional  values  that  these  quantifiers  should  yield  for  the 


given  mapping  are  derived  from  it  by  taking  gradients  and  using  appropriate 


definitions.  When  the  nonlinear  least-squares  parameter  fitting  is  being  done,  the 


map  is  simultaneously  constrained  to  also  reproduce  the  appropriate  values  of  the 


dynamical  quantifiers.  Examples  of  the  results  of  this  type  of  constrained 


optimization  are  given  in  both  papers  in  Appendix  D.  The  point  of  this  is  that 


the  resulting  map  now  reproduces  not  only  the  data  to  some  accuracy,  but  also 
reproduces  the  correct  dynamical  characteristics  of  the  system.  It  is  therefore 
expected  that  this  constrained  fit  for  the  function  will  prove  to  be  a  more 
accurate  predictor  of  the  actual  motion  of  the  system. 

Extensive  testing  and  development  of  this  idea  have  generally  shown  that 
this  type  of  constrained  parameter  fitting  seems  to  produce  considerably  superior 
predictive  power  than  that  of  other  less  sophisticated  techniques.  Results  of  this 
work  have  generally  shown  (see  Appendix  D)  that  it  is  possible  to  find  global 
mapping  functions  which  reproduce  time  series  data  to  an  excellent  degree  ( 

0.5%  average  rms  error  or  better )  and  which  are  also  capable  of  accurately 
reproducing  the  dynamical  quantifiers,  and  hence  have  the  same  chaotic 
invariants,  as  the  original  system.  Additionally,  numerical  experiments  with  the 
reproduction  of  known  trajectories  have  shown  that  this  type  of  predictor  can 
accurately  predict  orbits  significantly  farther  than  the  majority  of  other  known 
methods,  with  shadowing  trajectories  for  example  staying  close  to  the  original 
Henon  trajectories  for  as  long  as  seven  or  eight  iterations.  In  addition,  this 
method  seems  better  suited  than  most  methods  to  actual  experimental  data,  which 
may  include  significant  noise  components,  because  of  the  inherent  averaging  of 
the  local  phase  space  flow  which  occurs  via  the  mapping  function. 

Although  much  of  the  work  on  this  technique  is  completed,  it  has  yet  to 
see  widespread  use  for  time  series  of  actual  physical  systems.  Therefore,  future 
plans  for  this  project  involve  primarily  identifying  systems  for  which  this 
technique  may  prove  useful,  and  for  which  more  practical  experience  can  be 
gained.  Actual  improvements  to  the  method  will  involve  investigations  of 
different  types  of  general  mapping  functions,  which  could  improve  accuracy,  and 
also  of  different  algorithms  for  searching  data  sets  for  nearest  neighbors,  which 
currently  consumes  the  majority  of  the  computational  resources  associated  with 
the  calculafions. 


B.  Information  Theoretic  Methods  for  Determining  Dimension 


The  second  major  pro;  'hich  was  undertaken  to  develop  new 
methods  for  chaotic  time  series  analysis  was  the  attempt  to  develop  a  method  for 
determining  the  embedding  dimension  of  a  chaotic  attractor.  Determining  the 
correct  embedding  dimension  is  the  first  necessary  prerequisite  for  performing 
time-delay  reconstruction  of  the  attractor  from  a  time  series.  The  standard,  most 
successful  method  for  determining  the  embedding  dimension  is  the  Grassberger- 
Procaccia  algorithm,  but  this  algorithm  has  several  well  known  difficulties, 
including  ambiguity  in  determining  the  minimum  embedding  dimension  that  can 
be  used,  as  well  as  sensitivity  to  noise,  and  often  quite  substantial  data 
requirements.  The  new  method,  which  was  developed  in  collaboration  with 
H.D.I.Abarbanel  of  INLS,  is  based  on  a  result  from  information  theory,  and  has 
the  advantages  of  unambiguously  determining  the  minimum  embedding 
dimension  allowable,  as  well  as  being  much  more  robust  to  noise  and  requiring 
less  data.  The  background  and  results  of  this  new  method  are  presented  in  the 
pre-print  included  in  Appendix  E,  which  is  currently  in  review. 

The  central  tool  of  the  new  method  is  a  result  from  information  theory 
which  was  developed  over  some  time  by  Aikaike,  Wax,  Kailath,  and  others.  This 
result  is  the  definition  of  a  function,  called  the  Minimum  Description  Length 
(MDL)  function,  which  quantitatively  weighs  a  functional  which  is  essentially  the 
maximum  likelihood  fit  to  a  data  set,  versus  a  measure  of  the  complexity  of  the 
model  used  to  generate  the  fit.  In  simpler  terms,  it  weighs  the  trade-off  between 
a  data  model  being  a  better  fit  to  the  data,  versus  how  complex  (ie.  how  many 
parameters)  the  model  has.  For  a  class  of  fitting  functions  whose  dimensionality 
is  a  variable,  the  MDL  function  can  be  proven  to  take  a  minimum  at  the  minimal 
number  of  dimensions  necessary  to  describe  the  data.  The  principal  result  of  the 
paper  in  Appendix  E  is  the  adaptation  of  this  function  for  time  series  of  real  data 
and  for  different  normalizations  and  parameter  counting,  and  is  now  called  the 
Data  Description  Length  (DDL)  function.  This  paper  demonstrates  how  the 
DDL  function  can  be  used  to  determine  unambiguously  the  minimum  dimension 
necessary  to  embed  an  attractor  from  a  time  series  of  data. 

The  actual  algorithm  which  was  developed  works  by  first  constructing  a 
data  matrix  from  the  time  series,  and  then  calculating  its  eigenvalues  by  singular 
value  decomposition.  These  eigenvalues  are  used  in  the  determination  of  the 


maximum  likelihood  fit  to  the  data.  For  the  present  method,  a  guassian 
distribution  of  the  reconstructed  attractor  is  assumed,  which  of  course  is 
inaccurate  for  most  attractors.  The  central  point  here  is  that  although  our 
assumption  is  crude,  the  question  being  asked  is  also  rough,  ie.  how  many 
independent  dimensions  is  the  attractor  distributed  along.  By  assuming  a 
guassian  distribution,  the  specific  form  of  the  DDL  function  can  be  written 
down,  and  from  this  formula  the  numerical  value  of  the  DDL  function  can  be 
calculated  for  each  value  of  the  embedding  dimension.  The  dimension  for  which 
the  DDL  function  takes  a  minimum  is  the  dimension  which  best  describes  the 
data,  under  the  constraints  of  the  particular  distribution  chosen.  The  algorithm 
has  been  developed  to  the  point  where  one  simply  supplies  the  time  series  as  data, 
adjusts  a  few  parameters  to  the  proper  regimes,  and  then  picks  off  the  minimum 
of  the  resulting  plot  of  the  DDL  function  to  find  the  correct  embedding 
dimension. 

Results  of  experimentation  with  the  DDL  algorithm  (see  Appendix  E) 
have  shown  the  technique  in  most  cases  to  unambiguously  yield  the  correct 
embedding  dimension  for  test  chaotic  systems.  In  addition,  and  quite 
importantly,  the  method  works  in  the  presence  of  significant  amounts  of  noise 
(up  to  15-20  dB  SNR  ),  and  also  requires  far  less  data  and  computational 
resources  than  the  Grassberger-Procaccia  algorithm.  This  makes  the  method 
particularly  well  suited  to  analyzing  experimental  time  series.  There  are  still 
some  difficulties,  however,  with  determining  the  dimension  of  attractors  whose 
topology  is  not  simple  (ie.  multiple  lobes  or  interleaving).  This  problem  is 
almost  certainly  related  to  the  choice  of  a  gaussian  distribution  for  the 
underlying  maximum  likelihood  fit,  as  discussed  above. 

Although  no  immediate  work  is  planned  on  this  project,  future  efforts  will 
have  to  address  the  problem  of  formulating  the  DDL  functional  form  based  on 
the  inclusion  of  higher  order  terms  in  the  underlying  maximum  likelihood  fit. 
This,  unfortunately,  has  already  proven  to  be  a  difficult  task.  Finally,  further 
application  to  known  systems  will  be  necessary  to  gain  additional  practical 
exprience. 


V.  Summary 


Research  performed  during  this  funding  period  has  included:  an  improved 
formulation  for  the  energy  density  of  an  infinite  lattice  of  vortex-like  objects, 
and  an  investigation  of  all  possible  geometries  of  the  two-component,  triangular 
vortex  lattice  and  its  energy  of  slip  displacement;  an  extensive  numerical 
investigation  of  the  properties  of  an  important  new  model  of  the  interaction  of  a 
vortex  with  a  bluff  body  in  an  open  flow,  which  exhibits  a  'chaotic  trapping' 
phenomenon;  identification  of  low-dimensional  chaos  in  a  vortex  interaction 
experiment  in  open  flows,  utilizing  a  new  method  for  determining  minimum 
embedding  dimension;  an  extensive  new  method  for  using  time  series  of  data 
from  chaotic,  dissipative  systems  to  do  system  identification  and  prediction;  and 
finally  development  of  a  new  method  for  computing  the  minimum  embedding 
dimension  from  a  time  series,  used  in  the  vortex  experiment  above,  which  is 
unambiguous,  requires  less  data,  and  is  more  robust  to  noise  than  conventional 
techniques. 

All  of  the  above  projects  have  resulted  in  significant  new  results  regarding 
either  vortices  or  vortex  dominated  flows;  the  determination  or  description  of 
chaos  in  systems;  or  both  aspects;  and  have  culminated  in  three  published  papers, 
two  papers  currently  in  review,  and  one  pre-print  currently  in  preparation. 
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An  expression  is  derived  for  the  energy  density  of  a  lattice  of  point  vortices  (or  other  logarithmic 
objects)  having  an  arbitrary  number  of  vortices  of  arbitrary  strengths  in  an  arbitrary  unit  cell.  The 
result  is  expressed  in  the  form  of  a  rapidly  convergent  series  well  suited' for  numerical  evaluation. 
The  effects  of  separately  changing  the  shape  and  dimensions  of  the  unit  cell  are  shown  for  simple 
cases,  and  the  energy  of  the  triangular  lattice  is  calculated  as  a  function  of  slip  displacement. 


I.  INTRODUCTION 

We  consider  the  problem  of  finding  the  energy  of  an 
infinite  number  of  classical  point  particles  confined  to  a 
planar  lattice  and  interacting  pair  wise  with  a  logarithmic 
potential.  These  particles  will  be  viewed  as  vortices  in  an 
Eulerian  fluid;  they  are  also  equivalent  to  rectilinear  line 
charges,  line  currents,  or  screw  dislocations.  Our  objec¬ 
tive  is  to  find  the  relative  energy  of  different 
configurations  of  J  vortices  having  strengths 

r„r, . Vj  in  a  unit  cell  defined  by  the  lengths  L{ 

and  Lz  of  its  sides  and  the  angle  <f>  between  them. 

If  the  sum  of  the  vorticity  strengths  is  not  zero  in  the 
unit  cell  the  system  is  stationary  only  in’  a  coordinate 
frame  rotating  with  angular  velocity  ft, 


of  this  expression  it  becomes  easy  to  compare  the  energy 
of  all  possible  lattices  containing  the  same  mixture  of  vor¬ 
tex  species. 

II.  LATTICE  ENERGY 

The  total  energy  due  to  mutual  vortex  interaction  is 

ET=—7-22'rirjln\Ti-rJ\,  (1) 

w  I  j 

where  d  is  the  fluid  density  (mass  per  unit  area)  and  the 
double  sum  omits  /  —  j.  For  an  infinite  lattice  Er  is  un¬ 
bounded,  even  in  the  presence  of  a  background.  However 
this  unboundedness  is  easily  avoided  by  considering  the 
energy  per  vortex  E,  which  is  finite: 


r  1 

n=TTT-  ■’T'  r=2  r\  . 

We  consider  the  lattice  only  in  such  a  frame  or, 
equivalently,  in  a  nonrotating  frame  with  an  imposed 
background  solid-body  rotation  of  the  opposite  sign. 
—fir.  Similarly,  an  opposite  uniform  background  charge 
or  current  would  be  needed  for  line  charges  or  currents. 
Such  constant  background  fields  play  no  role  in  the  lat¬ 
tice  properties,  and  serve  merely  to  cancel  formal1  singu¬ 
larities  that  occur  at  zero  wave  number.  Of  course,  these 
background  fields  must  be  explicitly  included  to  study  the 
global  properties  of  finite2  systems. 

The  task  of  deriving  lattice  sums  for  Coulomb  interac¬ 
tions  has  a  long  history.3  Our  purpose  here  is  to  obtain 
the  most  efficient  lattice  sum  for  a  general  two- 
dimensional  lattice  and  our  method  based  on  results  by 
Glasser,4  who  considered  the  particular  case  of  a  rec¬ 
tangular  unit  cell  (0  =  90°).  In  addition  to  obtaining  a 
rapidly  convergent  lattice  summation,  we  obtain  an  ex¬ 
pression  for  the  energy  density  of  a  vortex  lattice  that  is 
invariant  to  physically  equivalent  designations  of  the  unit 
cell,  which -are  not  necessarily  primitive  cells.  By  means 


E  —  lim 


47T 


a/— oo  dJM 


-T  > 


(2) 


where  Af  is  the  number  of  unit  cells.  It  is  convenient  to 
subdivide  the  sum  over  all  vortices  into  sums  over  the  J 
vortex  species  in  all  unit  cells, 

2=22+-+2  «) 

i  h  h  Jj 


and  to  note  that 


22'raI>|r,o— r^|=Mrar02'ln|r°— r°0+Ln\  .  (4) 
4  h  n 


The  sum  in  the  above  equation  is  over  all  integers 
n,,n2  =  0,±  1,±2, ...,  except  if  a=P,  in  which  case 
«,  =  rc2=0  must  be  omitted.  The  vortex  positions  are  r°, 
a=  1, . . .  ,/in  a  reference  unit  cell  and 

Ln=Linicl+L2n2c2  (5) 

is  a  generic  lattice  vector  (e,-e2=cos0).  Using  Eqs.  (3) 
and  (4)  in  Eq.  (2)  gives 
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£  =  -r22'In|Lj-4  s’  2  ry^xinlr^+Lj 

n  J  a“  I  it 


■  psimft 


2tri(m,i)>|  +m3y2) 


2n  +m\p1-2m^m2pcos<f> 


where 


l  J 

r=-  y  r 


j  in 

J  <x=»l 


r0„=r°-r0 
ro/i  1 


III.  LATTICE  SUM 

To  evaluate  the  lattice  sums  in  Eq.  (6)  express  the 
Fourier  transform  of  the  logarithm  function  using  box 
normalization, 

,  I  | 2i t  —  exp(/k*x)  ,  -  ”i  n2 

ln| x |  =  lim - 2  —rrr~r>  k~2rr  — , —  , 

M-o s,s2?  k2+ p}  J|  s2 


in  the  limit  where  s{  and  s2  become  infinite.  A  nonzero 
“mass"  parameter  p  changes  the  logarithm  function  into 
a  short-ranged  one,  and  is  a  fundamental  parameter  for 
understanding  the  effect  of  the  background.  To  perform 
the  lattice  sum  it  is  convenient  to  employ  the  so-called 
reciprocal-lattice  vectors  g  defined  by 


2; r  m,  >n-< 

I7V|+ ifVi  ’  ve,-yw 


Then, 


-2ln|x  +  Lj  =  FU)+cm  , 


where 


V(X ) _ ■?■£ —  y  *01 

£.,£.2sin0  g(-^0)  g*  ’ 


c„  =  lim  — -  . 

n— o  [iJ-LlLzs\n<f> 


The  divergent  constant  c„  corresponds  to  the  g=0  com¬ 
ponent  (m\  =m2=0);  the  effect  of  the  background  is  to 
cancel  this  divergent  constant. 

To  apply  Glasser’s  method  one  first  writes  Eq.  (1 1)  in  a 
more  explicit  form, 


where  yj  —  (X|Sin^— x2cos^)/Z,,sin0  and  yi~x2/ 
L2sin^.  The  same  sequence  of  transformations  of  Ref.  4 
then  leads  to 

F(x)=(sin <f>/pn)'2i  cos(&z2/sin<0)/A:2 

Jc  =  l 


■jin  [j  h(s,z{tz2) , 


where 


n (s,z1,z2)— 1  — 2e  cos  z,-l - cos d> 

P 

+  e-2|z2+2«Si„*|/P(  (j5) 

with  z,  ~2nXj /Li  and p=L,/L2. 

In  terms  of  these  new  variables  a  lattice  translation 
x— »x  +  L„  becomes  z,— *zx  +2nnl  +  2n7i2cos$/p  and 
z2-*z2  +  2Trn2sini.  It  is  easy  to  verify  that  Eq.  (14)  is  in¬ 
variant  under  lattice  translations,  consistent  with  Eq. 
(10).  The  first  summation  of  Eq.  (14)  can  be  performed, 
giving  the  more  efficient  representation,5 
00 

(sin^/rr)  £  cos(kz2/s\n<f>)/k2 

Jt =i 

=  lz2!(|z2|/sin<£— 27r)/47r+7r(sin0)/6  ,  (16) 

valid  for  |z2|  527rsin <f>.  The  consequent  loss  of  transla¬ 
tional  invariance  in  the  e2  direction  causes  no  difficulty  in 
numerical  evaluations. 

The  expression  for  V{x)  given  by  Eqs.  (14)— (16)  con¬ 
verges  quite  rapidly.  In  practice,  the  evaluation  of  the 
infinite  product  of  terms  h  (s,z,,z2)  reduces  to  the  multi¬ 
plication  of  about  four  to  eight  terms  because,  for  large 
integers  s,  /t(s,z,,z2)  is  dominated  by  unity  plus  terms 
proportional  to  exp(  ~s ),  which  have  a  very  fast  decay. 
This  product  expansion  is  almost  identical  to  the  expan¬ 
sion  for  the  Jacobi  ©  functions;  for  the  special  case  of 
<f>=90°  studied  by  Glasser,  it  reduces  to  them.  Finally,  an 
expression  is  needed  for  the  first  term  of  Eq.  (6),  which  is 
the  energy  of  identical  vortices  on  a  primitive  lattice,  a 
result  derived  also  by  Tkachenko.6  This  term  is 
equivalent  to  the  following  limit: 

2'ln|Lj  =  lim  fyinlx+L-j  — lnlxj  1  ,  (17) 

n  IT  “  j 

Performing  this  limit  on  Eq.  (14)  gives 


00 

-2'ln|Lfl|  =  -^-sin^-ln(27r/LI)-ln  JJ  { 1 -2e"2”,sin*1/Pcos[27rs(cos^)/p]+e~4”,sin*l//’}  . 


Now  we  scale  the  energy  to  obtain  equal  energies  for  physically  equivalent  lattices.  This  is  simply  done  by  noting 
that  E  in  Eq.  (6)  is  the  energy  per  vortex.  Hence,  scaling  the  lengths  L,  and  L2  by  a  constant  a  gives  the  correct  nor¬ 
malized  energy  and  renders  a  constant  vortex  density.  This  causes  no  changes  to  the  ratio  /L2,  but  in  Eq.  (18)  the  di- 
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where  zu/  =  2tr(r°— r?)-£ /£,,,  z2  (/- =  2rr(r,)— r°)\y /L2, 
and  /i(5,Z|,z2)  is  defined  in  Eq.  (15).  This  expression  for 
E  gives  the  re'ative  energy  density  of  lattices  containing 
fixed  ratios  of  vortex  species  having  fixed  strengths.  To 
compare  the  energies  of  lattices  which  do  not  have  the 
same  mixtures  of  vortices  requires  assumptions  or  physi¬ 
cal  information  about  the  vortex  self-energies. 

What  makes  Eq.  (21)  useful  for  numerical  evaluation  is 
the  fast  convergence  of  the  function  ,’i  (s.z,  ,z2 ).  Some  ap¬ 
plications,  not  discussed  here,  require  calculating  the  par¬ 
tial  derivatives  of  E ,  for  which  it  is  convenient  to  change 
the  unit-cell  variables  p  and  #  to  cr  =  2jr(sin<A)/p  and 
X~2n(cos<b)/p. 


IV.  EXAMPLES 

The  foregoing  results  will  now  be  applied  to  some  sim¬ 
ple  examples.  First,  consider  the  change  of  lattice  energy 
density  induced  by  varying  the  angle  #  between  the  lat¬ 
tice  generators  while  holding  fixed  the  lengths  of  the  unit 
cell  and  the  relative  positions  of  the  vortices.  Three  cases 
will  be  considered:  (a)  one  vortex  per  unit  cell  with 
L,—L2;  l b)  two  vortices  per  unit  cell,  also  with  Lt  =L2; 
and  finally  (c)  two  vortices  per  unit  cell  with 
L\=L1/V‘i,  The  results  as  calculated  from  Eq.  (21)  are 
shown  in  Fig.  1.  The  triangular  lattice  occurs  for  (a) 
when  (£=60°  and  120°  and  for  (c)  when  #=90°.  The 


FIG.  1.  Effect  of  varying  the  angle  #  between  the  unit-cell 
generators  for  fixed  unit-cell  lengths.  The  different  unit  cells  are 
illustrated  for  <£=90°.  (a)  One  vortex  per  unit  cell  with  Lx  =Lt. 
(b)  Two  vortices  at  positions  (0,0)  and  (0.5, 0.5)  with  respect  to 
the  unit-cell  lengths,  L,  =L2.  (c)  Two  vortices  at  positions  (0,0) 
and  (O.v/3/2)  in  a  unit  cell  with  L1=LJ/v' 3=1.  The  energy 
density  is  the  energy  per  vortex  in  units  of  dr2/4n,  where  d  is 
the  fluid  density  and  T  is  the  unit  of  circulation. 


t2/  L\ 

FIG.  2.  Effect  of  changing  the  aspect  ratio  L2/L\  for  fixed 
angle  #.  The  various  unit  cells  are  illustrated  for  L2/Lx  =  \, 
with  the  vortices  associated  with  the  unit  cell  indicated  by  solid 
circles,  (a)  One  vortex  with  #=90*.  (b)  Two  vortices  with 
#=90°.  (c)  Two  vortices  with  #=60*.  (d)  Two  vortices  with 
#=45*. 
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square  lattice  occurs  for  both  (a)  and  (b)  at  <£=90°.  The 
energy  densities  of  the  triangular  and  square  lattices  are 
-1.321  1174284  and  -1.3105329259,  respectively. 
(Earlier  evaluations  of  the  energies  of  these  simple  lattices 
are  equivalent  within  constants.6,2)  Although  curve  (b) 
has  a  minimum  at  <t>= 90°  this  is  a  constrained  minimum 
and  does  not  result  in  a  stable  lattice;  indeed,  it  joins 
curve  (a)  which  leads  to  the  absolute  minimum. 

Next,  the  angle  0  is  constrained  and  the  ratio  of  unit¬ 
cell  lengths  L2/L\  is  varied.  These  results  are  shown  in 
Fig.  2,  where  the  various  cases  are  (a)  one  vortex  and  (b) 
two  vortices  per  unit  cell  with  0=90°,  (c)  two  vortices 
with  0=60’,  and  (d)  two  vortices  with  0=45°.  Only 
curve  (b)  achieves  the_trianguiar  lattice.  This  occurs  at 
L1/L\=V  3  and  l/v'2.  Note  that  the  horizontal  scale  is 
logarithmic,  to  illustrate  the  symmetry  around  L2/L , 
=  1.  It  appears  that  curve  (d)  may  also  reach  the  low  en¬ 
ergy  of  the  triangular  lattice.  In  fact,  it  does  not,  nor  is 
the  minimum  it  does  reach  an  unconstrained  minimum  of 
the  lattice.  Also  despite  appearances,  curves  (b),  (c),  and 
(d)  do  not  mutually  intersect. 

Finally,  the  slip  strength  of  the  triangular  vortex  lattice 
is  calculated  for  displacements  along  one  of  the  principal 
axis  directions.  That  is,  the  energy  density  is  evaluated  as 
a  function  of  a  rigid  displacement,  through  one  lattice 
spacing,  of  a  number  n  of  lattice  rows  with  respect  to  the 
same  number  of  fixed  rows.  The  pattern  repeats,  of 
course,  to  infinity.  During  this  displacement  the  unit-cell 
dimensions  and  angle  are  held  fixed  so,  in  particular, 
there  is  no  change  in  volume.  Figure  3  shows  the  results 
for  various  n,  as  labeled.  Obviously,  the  maximum 
occurs  for  a  displacement  halfway  between  equilibrium 
positions  and  is  largest  for  alternating  single  rows 
In  =  1).  This  energy  is  just  that  of  a  rectangular  lattice 
with  L2/L ,  =(v'3/2)±1=(0.866)±1,  which  can  be 
verified  by  comparing  the  maximum  in  Fig.  3  with  curve 
(a)  in  Fig.  2  at  that  ratio.  The  curves  are  approximately 
related  to  each  other  by 

nj[Ej(d)-Et]**nk[Ekld)-E,]  ,  (22) 

where  E,  is  the  triangular  lattice  energy  density  (given 
above)  and  d  is  the  displacement.  Future  publications 
will  treat  other  applications,  especially  those  that  involve 
seeking  minima  of  the  energy  density  in  the  presence  of 
additional  dynamics,  mixtures  of  vortex  strengths,  and 
the  unconstrained  space  of  lattice  variables.7 

V.  CONCLUSION 

Lattices  of  nonneutral  vortices,  like  charges,  have  a 
long-range  interaction  which  leads  to  a  formal  singularity 


FIG.  3.  Slip  strength  of  triangular  vortex  lattice  for  the  rigid 
displacement  of  n  rows  of  vortices  with  respect  to  n  stationary 
rows  in  each  unit  cell.  The  curves  are  labeled  by  n  and  the 
range  of  displacement  is  one  lattice  spacing  along  a  principal 
axis. 


when  the  lattice  energy  is  calculated  as  the  N -*  oo  limit 
of  a  finite  system.  By  the  same  method  as  used  for 
charges,  this  singularity  can  be  removed  by  adding  a  neu¬ 
tralizing  background.  For  vortices,  this  background  is 
taken  to  be  uniform,  with  the  result  that  there  is  no 
phenomenon  of  screening.  Also,  like  charges,  the  field 
for  each  vortex  leads  to  a  formal  singularity  in  the  self¬ 
energy  in  the  limit  of  vanishing  core  size.  This  singulari¬ 
ty,  too,  is  irrelevant,  except  that  it  prevents,  in  the  ab¬ 
sence  of  further  assumptions  or  physical  information,  a 
comparison  of  the  lattice  energies  of  vortex  systems  con¬ 
taining  different  mixtures  of  vortex  strengths. 

The  energy  density  of  the  general  vortex  lattice  (arbi¬ 
trary  unit  cell  and  arbitrary  number,  magnitudes  and 
signs  of  strengths  of  vortices  per  unit  cell)  is  given  by  Eq. 
(21),  which  has  the  virtue  of  being  easily  evaluated  nu¬ 
merically,  in  the  sense  of  rapid  convergence  of  its  infinite 
products.  This  expression  provides  a  new,  practical  tool 
for  studying  a  wide  range  of  vortex  lattice  problems. 
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ABSTRACT 


It  is  shown,  by  using  the  Poincare-Melnikov-Amold  method,  that  the  motion  of  a  linear 
vortex  in  the  flow  past  a  cylindrical  body  is  chaotic.  In  particular,  a  vortex  can  be  captured 
by  the  body  and  then,  after  some  complicated  rotation  t'lCQ.r  the  body,  can  be  lost.  More 
general  problems  of  vortex-body  interaction  are  discussed  qualitatively.  Possible  applications  of 
the  theory  are  indicated. 


The  study  of  the  chaotic  interactions  of  linear  vortices  lias  a  twelve  year  history  with  the 
participation  of  many  authors  (see  Ref.  1-15  and  references  therein).  Generalization  to  the 
axisymmetric  flows  have  been  indicated  in  Ref.  16.  The  main  conclusion  from  these  studies  is 
that  two-dimensional  and  axisymmetrical  flows  of  ideal  incompressible  fluid  are  generally  non- 
integrable  and  with  appropriate  initial  conditions  exhibit  chaos.  This  is  in  contrast  with  the 
opinion  which  prevailed  earlier  among  the  advocates  of  the  soliton  approach  to  the  hydrody¬ 
namics.  The  chaotic  motion  of  vortices  has  various  applications,  in  particular,  to  the  problem 
of  weather  prediction..1,17 

The  main  goal  of  this  note  is  to  indicate  novel  features  of  chaotic  motion  which  arise  in  the 
presence  of  a  moving  body.  Firstly,  it  is  enough  to  have  only  one  vortex  in  order  to  get  chaotic 
motion.  Secondly,  the  mechanism  of  generation  of  chaos  is  very  transparent  in  the  vortex-body 
system.  Thirdly,  we  get  new  phenomena  —  chaotic  capture  —  loss  of  vortex  by  a  moving  body. 

Vortex-body  interactions  are  important  in  many  situations.  The  most  dramatic  examples  are 
.  the  aviacatastrophies,  caused  by  a  vortex  initiated  by  downdraft  of  cold  air,18  and  the  destruction 
of  buildings  by  tornado.  We  will  start  with  an  analytical  description  of  the  motion  of  a  linear 
vortex  in  the  flow  past  a  circular  cylinder.  Then  we  will  make  some  qualitative  remarks  about 
more  general  problems. 

In  the  frame  of  reference  moving  with  the  cylinder,  the  velocity  of  a  linear  vortex  in  an  ideal  ' 
fluid  is  a  Hamiltonian  superposition  of  two  parts  of  motion.  The  first  part  corresponds  to  the 
potential  motion  of  fluid  relative  to  the  cylinder  (see,  for  example,  Ref.  19),  such  as  if  the  vortex 
has  zero  intensity.  The  second  part  of  the  motion  is  induced  by  the  interaction  of  vortex  with 
cylinder.  In  the  case  of  circular  cylinder,  this  motion  is  induced  by  an  image  vortex,  placed 
inside  the  cylinder  at  the  distance  from  the  center  r  =  a2/r,  where  a  is  the  radius  of  cylinder 
and  r  corresponds  to  the  position  of  vortex.  Both  parts  of  motion  have  zero  normal  components 
of  velocity  at  the  surface  of  the  cylinder. 

We  will  scale  distances  by  a  and  time  by  a/uo,  where  u0  is  the  characteristic  fluid  velocity 
,  at  infinity.  If  the  fluid  velocity  at  infinity  is  constant,  then  the  problem  is  characterized  by  only 
one  nondimensional  parameter  'a  =  K/2xauQ)  where  k  is  the  vortex  intensity.  In  the  case  of 
vibration  of  cylinder,  we  have  the  relative  fluid  velocity  at  infinity 


u(t)  =  u0(l  +  esincuf),  (1) 

where  e  and  u>  are  the  nondimensional  amplitude  and  frequency  of  vibration. 

In  polar  coordinates  (p,  <f> )  with  origin  in  the  center  of  the  cylinder,  the  Hamiltonian  system 
for  the  motion  of  the  vortex  has  the  form 
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With  e  =  0,  the  system  (2)-(4)  has  a  general  analytical  solution.  The  vortex  trajectories  (4) 
are  presented  in  Fig.  1.  Without  loss  of  generality,  we  assume  that  the  motion  of  fluid  around 
the  vortex  is  clockwise  (<r  <  0)  and  the  direction  of  fluid  velocity  at  infinity  is  from  the  top  to 
the  bottom  on  Fig.  1.  We  see  that  there  is  a  homoclinic  vortex  trajectory  [/>o(i),  ^o(01  w,th 
the  hyperbolic  stationary  point  at  <f>  =  0,p  =  p.,(cr  =  pZ3  —  />.)*  The  homoclinic  trajectory 
separates  the  region  where  the  vortex  is  captured  by  the  body. 

The  stationary  trajectories  of  vortex  in  the  flow  past  circular  cylinder  have  been  studied 
in  Ref.  20  without  recognizing  the  Hamiltonian  structure  of  the  problem.  The  homoclinic 
trajectory  have  not  been  indicated  in  Ref.  20,  probably  because  at  that  time  the  homoclinic 
trajectory  was  considered  as  something  pathological  and  physically  irrelevant  to  the  problem. 
Now  we  know  how  important  the  homoclinic  trajectories  are  for  the  generation  of  chaotic  motion. 

For  e  small  but  non-zero,  the  system  (2),  (3)  has  no  analytic  integrals  of  motion.  It  possesses 
transversal  intersecting  stable  and  unstable  manifolds  —  namely,  the  Poincare  maps  V(to)  which 
advance  a  solution  by  one  period  T  -  2z/u  starting  at  time  t0 ,  possess  transversal  homoclinic 
points.  We  will  show  this  by  using  the  Poincare-Melnikov-Arnold  (PMA)  method.  This  type  of 
behavior  of  dynamical  systems  is  called  chaotic. 

According  to  PMA  method, 21,21,23  we  consider  the  Melnikov  function 


M(t0)  =  jT 


{H0)Hi}(po{t  -  o(t  -  f0)i 


(6) 


where  {,}  denote  the  Poisson  brackets 


()  =  i 

P 


dA(p,  4>,i)dD{p ,  4>,i)  _  8A(ptfat)dB(pt  <jM) 
dp 


d<f> 


d<f> 


dp 


Integral  (6)  is  taken  along  the  homoclinic  trajectory  [p0(f  —  fo)>  <j>{t  —  to)].  We  will  show  that 
M(to)  has  simple  zeros. 

From  (4)-(6),  by  a  change  of  variable,  we  get: 

,r/j  \  f°°  sin[?$o(t  -  to)]  .  ,  „ 

M\t0)  =  -or  /  - - — — - — sin  utdl 

j- oo  pQ\t  ~  Lq) 

=  —erf  — ^—(sinwt  cosuto  +  cosu)ismtot0)  dt  (7) 

•'-oo  pQ\l) 

Since  <f>o (t)  — »  0,  po(t)  — ♦  p .  exponentially  near  the  hyperbolic  stationary  point  (when  t  — »  ±oo),- 
the  integral  (7)  is  convergent.  It  is  convenient  to  choose  the  initial  position:  ^o(0)  =  »r.  In  this 
case  we  have  =  —Mt),  po[-t)  -  Po(0  *nd  (7)  reduces  to: 


\  J  f°°  sin[^o(i)]  .  ,  fl 

M (t0)  =  -a  cosui0  /  — -t :V  ~  sin  cut  dt. 

J- oo  /?o(l] 


Po(i) 


(8) 


The  integral  in  (8)  is  not  identically  zero,  because  it  is  the  Fourier  transform  of  a  function 
which  is  not  identically  zero.  Function  M (to)  clearly  has  simple  zeros.  According  to  PMA  theory, 
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this  proves  that  system  (2,3)  has  no  analytic  integrals  of  motion  and  vortex  trajectory  is  chaotic. 
In  particular,  the  vortex  which  is  initially  far  from  the  body,  can  intersect  the  homoclinic  loop 
and  will  be  captured  by  the  body.  After  several  complicated  revolutions  around  t/fae  body,  the 
vortex  will  eventually  escape.2*  In  connection  with  the  capture-loss  phenomena,  we  have  the 
following  theorem.  Let  S  be  the  set  of  positions  of  the  vortex  at  time  fo>  for  which  the  vortex  for 
all  t  >  tQ  will  stay  inside  a  circle  C  surrounding  the  body.  It  can  be  proven  that  the  subset  of  S’, 
for  which  the  vortex  was  outside  of  C  of  some  t  <  t0,  has  zero  measure.  The  proof  is  the  same  as 
in  the  Littlewood’s  theorem25  for  conservative  (gravitational)  system.  The  only  condition  which 
matters  is  the  preservation  of  volume  in  phase  space.  The  nonautonomous  Hamiltonian  system 
clearly  satisfies  this  condition.  The  theorem  remains  true  if  instead  of  circle  C  surrounding  the 
body,  we  choose  any  area  in  phase  space. 

In  the  case  of  arbitrary  shape  of  a  cylindrical  body  we  still  have  Hamiltonian  superposition 
of  external  and  induced  motion  of  the  vortex  in  terms  of  corresponding  Green’s  functions  for 
the  Laplace  operator.  The  external  velocity  is  finite  everywhere.  The  induced  velocity  is  infinite 
near  the  body  and  zero  at  infinity.  Having  this  in  mind,  we  generally  can  expect  existence 
of  a  homoclinic  separatrice  with  hyperbolic  stationary  point,  where  two  parts  of  velocity  are 
balanced  (in  the  case  of  stationary  external  velocity).  Thus,  the  described  above  phenomena  of 
chaotic  vortex-bodv  interaction  seems  to  be  generic. 

The  local  kinematic  pressure,  exerted  on  the  surface  of  the  body  by  the  vortex,  is  of  the  order 
of  /c2/cP,  where  d  is  the  distance  between  vortex  and  body.  In  the  chaotic  regime  of  motion,  a 
vortex  can  come  closer,  it  will  spend  more  time  near  the  body  and  is  more  likely  to  create  a 
destructive  impact  on  the  body. 

The  generalization  to  the  case  when  we  have  additional  circulation  kq  around  the  cylinder  is 
straightforward.  In  this  case  we  have  to  add  into  (4)  the  term  — cr0\n  p ,  where  crQ  =  K0/2irau0. 
By  using  the  above  described  procedure,  it  is  easy  to  show  that  in  the  time-periodic  external 
flow,  the  motion  of  fluid  particles  becomes  chaotic  even  without  a  vortex  (cr  =  0)  when  |<r0J  >  2. 

Condition  |cr0|  >  2  insures  the  existence  of  a  hyperbolic  stagnation  point  when  <r  —  0. 
Generally,  when  a  ^  Q  and  cr0  ^  0,  the  unperturbed  system  has  several  stagnation  points 
(hyperbolic  and  elliptic).  With  e  ^  0,  the  vortex  can  chaotically  change  the  direction  of  rotation 
during  the  capture.  This  is  clear  physically,  but  it  has  to  be  investigated  in 
detail  numerically. 

In  practical  problems,  the  vortex  has  a  finite  core,  which  leads  to  a  system  with  an  infinite 
number  of  degrees  of  freedom.  If  the  size  of  the  core  is  small  in  comparison  with  the  size  of  the 
body,  the  multipole  representation  of  the  vortex  can  be  used.  In  the  simplest  representation  we 
have  two  closely  located  concentrated  vortices,  which  rotate  around  each  other.  In  this  case  we 
get  chaos  even  without  oscillation  of  the  cylinder  (e  =  0).  The  proof  is  lengthy  and  will  not  be 
presented  here,  but  the  idea  is  simple  —  the  reduction  of  the  Hamiltonian  system.23  The  slow 
variables  are  the  coordinates  of  the  center  of  vorticity,  the  angle  of  mutual  rotation  of  vortices 
plays  the  role  of  time  and  the  distance  between  vortices  is  a  small  parameter.  In  the  case  of 
bigger  distances  between  two  vortices,  one  of  the  vortices  can  be  captured  forever  and  other 
will  escape.  The  capture  of  one  of  the  vortices  does  not  contradict  Littlewood’s  theorem.  This 
kind  of  partial  capture  of  vorticity  in  a  more  complex  situation  happens  when  a  tornado  hits  a 
building. 
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The  problem  of  three-dimensional  chaotic  vortex-body  interactions  with  the  effects  of  vortex 
reconnection  is  more  difficult  and  profound.  In  this  case  we  plan  to  use  the  method 
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Acknowledgements 

Tins  work  is  supported  by  the  U.S.  Department  of  Energy  under  grant  DE-FG030-87ER13801. 
help  in  preparing  Figure  !.  *  9  '  ^  l°  thank  M‘  Mikulska  for 


4 


Figure  Captions 

1.  The  vortex  trajectories  in  the  flow  past  a  cylinder  (/).  »  3), 
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Photograph  Of  Chaotic  Vortex  Interaction  Region  Of  Flow 
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We  consider  the  problem  of  prediction  and  system  identification  for  time  series  having  broadband  power  spectra  which  arise 
from  the  intrinsic  nonlinear  dynamics  of  the  system.  We  view  the  motion  of  the  system  in  a  reconstructed  phase  space  which 
captures  the  attractor  (usually  strange)  on  which  the  system  evolves,  and  give  a  procedure  for  constructing  parameterized  maps 
which  evolve  points  in  the  phase  space  into  the  future.  The  predictor  of  future  points  in  the  phase  space  is  a  combination  of 
operation  on  pact  points  by  the  map  and  its  iterates.  Thus  the  map  is  regarded  as  a  dynamical  system,  not  just  a  fit  to  the  data. 
The  invariants  of  the  dynamical  system  -  the  Lyapunov  exponents  and  aspects  of  the  invariant  density  on  the  attractor  -  are  used 
as  constraints  on  the  choice  of  mapping  parameters.  The  parameter  values  are  chosen  through  a  least-squares  optimization  pro¬ 
cedure  The  method  is  applied  to  “data"  from  the  H<non  map  and  shown  to  be  feasible.  It  is  found  that  the  parameter  values 
which  minimize  the  least-squares  criterion  do  not,  in  general,  reproduce  the  invariants  of  the  dynamical  system.  The  maps  which 
do  reproduce  the  values  of  the  invariants  are  not  optimum  in  the  least-squares  sense,  yet  still  are  excellent  predictors.  We  discuss 
several  technical  and  general  problems  associated  with  prediction  and  system  identification  on  strange  attractors.  In  particular, 
we  consider  the  matter  of  the  evolution  of  points  that  are  off  the  attractor  (where  little  or  no  data  is  available),  onto  the  attractor, 
where  long-term  motion  takes  place. 


A  broadband  power  spectrum  observed  in  the  time 
series  of  a  system  variable  may  have  its  origin  in  noise 
extrinsic  to  the  system.  However,  it  has  become  clear 
in  recent  years  that  its  origin  may  be  in  the  nonper¬ 
iodic,  deterministic  chaos  associated  with  a  nonlin¬ 
ear  system  evolving  on  a  finite-dimensional  strange 
attractor  [  1  ].  In  the  latter  case,  which  is  the  one  we 
address  in  this  note,  the  geometrical  structure  of  the 
attractor  and  thus  of  the  time  series  may  be  exposed 
b>  the  method  of  phase  space  reconstruction  [2-4  J. 
This  takes  an  observed  scalar  variable,  x(n)  = 
x(t0+nAl),  and  produces  a  D-dimensional  embed¬ 
ding  space  from  the  time  lagged  signals  x(n ), 
x(n+r, ),  ...,  x(n+Tz>_i),  where  the  x,  are  appro¬ 
priately  chosen  lags  [5].  The  sequence  of  D-vectors 
for  n=l,  2, ...,  N 

1  Institute  for  Nonlinear  Science. 


J(n)={x(/»),X(/1+T,) . X^+Tj,.,)]  (1) 

describes  the  evolution  of  the  system  in  the  embed¬ 
ding  space.  Theorems  due  to  Takens  and  Mafid  [3,4] 
tell  us  that  if  D  is  about  twice  the  Hausdorff  dimen¬ 
sion  of  the  attractor,  we  are  assured  of  a  good  rep¬ 
resentation  of  that  attractor  by  the  y{n).  In  practice, 
and  in  this  note,  a  good  representation  is  achieved 
by  taking  t,=/t,  with  ra  common  lag,  and  D  the  least 
integer  dimension  greater  than  the  Hausdorff  di¬ 
mension.  Along  with  others,  we  use  properties  of  a 
correlation  function  [6]  to  decide  which  D  is  ap¬ 
propriate  for  a  given  data  set. 

The  evolution  of  vectors  y  in  the  embedding  space  , 
RD  provides  a  dynamically  sound  setting  for  the 
analysis  of  the  broadband  time  series.  The  first  step 
in  our  analysis  is  system  identification  or  parameter 
estimation  for  a  map  F{y,  a)  fiom  R°  to  itself,  de- 
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pending  on  parameters  «s(ai,  ...,  ar).  This  map 
takes y(n)  toy(n+l): 

y(n+l)=F(y(n),*).  (2) 

Given  a  form  for  F(y, «),  the  parameters «  are  to  be 
estimated.  Both  the  form  of  F(y,  t)  and  the  criteria 
for  choosing  a  are  the  issues  of  this  note.  Using  F(y, 
«)  for  prediction  of  the  future  evolution  of  the  un¬ 
derlying  system  or  for  its  control  will  be  the  subject 
of  subsequent  papers  (7], 

Actual  extrinsic  noise  complicates  the  analysis  of 
time  series  and,  of  course,  is  inevitably  present  in  any 
interesting  time  series.  In  this  note  we  ignore  the  role 
of  such  extrinsic  noise.  We  wish  to  separate  the  treat¬ 
ment  of  data  with  intrinsic  broadband  time  series 
from  the  study  of  such  data  contaminated  by  extrin¬ 
sic  noise.  We  will  return  to  the  analysis  of  extrinsi- 
cally  contaminated  chaotic  motion  in  our  later  work 

[7]. 

One  of  our  central  assumptions  is  that  the  points 
y(n)  lie  on  an  attractor  which  is  usually  strange  or 
fractional  dimensional.  Motion  on  the  attractor  is 
taken  to  be  chaotic  or  sensitive  to  initial  conditions. 
Thus,  predicting  the  value  of  points,  y(n),  on  any 
individual  orbit  evolving  from  a  starting  value  y(  1 ): 

y(n)=F(y(n- 1 ),  a)=F(F(y(n- 2),  a),  a) 

=F2(y(n—2),  «)=...=/■"- 1  (y(  1 ),  a)  (3) 

is  a  numerically  uncertain  matter  as  n  grows  large. 
Indeed,  for  familiar  low-dimensional  systems  such 
as  the  Hdnon  map  or  the  Lorenz  attractor  [8],  n 
larger  than  order  10  is  usually  unpredictable  given 
small  machine  or  initial  condition  errors. 

Prediction  of  the  evolution  of  a  point,  y,  on  the 
attractor  in  the  reconstructed  phase  space  may  be  de¬ 
termined  by  looking  at  points  in  the  temporal  past 
of  y,  as  well  as  the  evolution  of  points  that  are  both 
spatially  nearby  and  on  the  attractor.  Knowledge  of 
where  the  neighbors  of  the  point  v  have  evolved  is 
as  important  as  knowledge  of  where  its  temporal 
predecessors  have  been.  This  notion  is  present  in 
several  papers  dealing  with  the  same  general  subject 
as  this  note  [9],  Our  formulation  of  the  concept  is 
embodied  in  the  explicit  analytic  formula  we  give  for 
the  map  P(y,  a).  This  aids  in  both  the  analysis  of  the 
map’s  properties  and  its  use  in  numerical  work.  With 
the  idea  in  mind  of  using  information  about  the 


neighbors  of  a  point  to  assist  in  predicting  where  it 
will  go,  we  choose  mapping  functions  F(y,  a)  of  the 
form*1 

F(y,a)=  £  yU+i)f,(y,y(j),«) ,  (4) 

;-i 

where  the  y(j)  are  the  D-vectors  constructed  from 
the  original  scalar  time  series  x(n).  The  fa(y,  y(j), 
a)  are  functions  parameterized  by  the  a’s  and  vanish 
rapidly  when  |y— y(j)\2/a'*‘  L  (|  |  is  the  Euclid¬ 
ean  distance  in  R°.)  F(y{n),  a)  is  then  determined  ■ 
by  all  the  neighbors  of  y(n)  regardless  of  the  tem¬ 
poral  order  in  which  the  neighborhood  is  visited.  For 
appropriate  choices  of  fa  the  value  of  F(y,  a)  is  a 
weighted  average  of  the  places  in  Rc  to  which  the 
neighbors  of  y  go  in  one  iteration  of  the  map. 

Our  specific  choice  of fa{y,  y(j),  a)  is 

fc(y.yU), «)  =exp(-  |y-y(/)  l2/ff) 

x^a1+a2j'0>ly-.>’(/)] 

+  iak{\y-y{j)\2/or')  (5) 

with  mk  some  choice  of  integers.  This  is  one  among  -i 
a  large  class  of  /’ s  satisfying  our  requirements  of  r 
evolving  points  according  to  the  fate  of  their  neigh-  A 
bors.  This  equation  determines  how  “close”  y  is  to  ■! 
each  of  the  data  vectors  y(j).  For  large  values  of  '1 
\y-yU)\2/o  the  exponential  dominates  and  the  4 
function  approaches  zero.  The  polynomial  softens  tj 
the  exponential  decay  while  simultaneously  provid-  'i 
ing  a  polynomial  fit  to  the  data  for  those  values  of  : 
|y— y(j)  | 2 /a  that  are  small.  '■! 

A  problem  arises  when  f„(y,  y(j),  a)  in  our  map 
depends  only  on  the  Euclidean  distance  of  y-y(j)-  i 
We  find  that  if  c  is  small  enough  to  accurately  fore-  i 
cast  points  on  the  attractor  then  the  Jacobian  matrix' t 
of  F(y,  a)  which  we  will  numerically  evaluate  on  the  i 
data  may  be  so  small  (due  to  the  exponential  term)  j 
that  we  are  unable  to  calculate  Lyapunov  exponents.' } 
The  term  associated  with  a2  provides  for  a  non-zero  i 
Jacobian  when  y  is  evaluated  at  some  y(j)  which  has  ■ 

no  neighbors  in  the  data  set,  A  zero  Jacobian  would  ■ 

j 

"  Arguments  for  the  general  form  here  ire  given  in  ref.  [10].  ’f| 
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! 

C(X,  *)  =  (N—L—  1 )“' 


I 

I 

be  fatal,  since  we  will  eventually  use  F(y,  «)  to  cal¬ 
culate  our  Lyapunov  exponents. 

The  predictor  we  choose  for  a  point  y(N+ 1 )  could 
in  principle  be  just  F(y(N),  a).  It  could  also  be 
F2(y(N- 1 ),  a)  or  FJ(y(N-j+ 1 ),  a)  for  any  j.  We 
are  seeking  a  map,  F(y,  a),  to  reproduce  the  data 
y(  1 ), ....  y(N)  as  well  as  possible,  However,  it  is  also 
critical  that  the  value  of  y(j)  be  the  result  of  iterates 
of  the  map  on  y{J—  1  ),y(j-2)t  •••  etc;  that  is,  we  are 
approximating  the  dynamical  system  which  gives  us 
the  y(J)-  With  this  in  mind  we  have  chosen  as  pre¬ 
dictors  a  weighted  average  over  L  equivalent  forms 
ofy(Ar+ 1 ).  L  will  typically  be  a  small  number,  and 
the  weights  Xj  will  be  chosen  to  decrease  as  j  in¬ 
creases  in  some  manner  consistent  with  one’s  con¬ 
fidence  in  the  reliability  of  FJ(y,  a).  We  choose  then 

>>(W+1)=  1  XjFJ(y(N—j+ 1, a) .  (6) 

j- 1 

If  F(y,  a)  were  the  exact  mapping,  then  each  term  in 
this  sum  would  be  Xjy{N+ 1 )  itself,  thus  we  require 

IA)«1.  (7) 

j-i 

This  choice  of  predictor  is  a  natural  extension  to 
the  nonlinear  situation  of  the  linear  predictive  scheme 

y(N+\)=iXjy(N-j+\)  (8) 

J- 1 

discussed  in  many  places  [11].  By  including  itera¬ 
tions  of  the  map  F(y,  a)  it  explicitly  embodies  the 
idea  that  we  are  dealing  with  an  iterated  map  or  dy¬ 
namical  system.  It  also  provides  a  “lever  arm"  on 
predictions  since  it  looks  back  not  just  one  step  but 
many  to  see  where  a  given  point  y  will  evolve.  Since 
FJ(y,  a)  provides  information  on  the  evolution  of 
the  neighbors  of  the  points  which  end  up  at  y(N+ 1 ) 
this  "lever  arm”  is  both  temporal  and  spatial.  Inter¬ 
estingly,  using  the  optimization  criteria  we  are  about 
to  discuss,  this  method  also  yields  much  better  quan¬ 
titative  fits  to  the  data  than  using  the  single  term 
F(y(N),a). 

In  principle  one  would  establish  the  parameters 
Xz=(X,t ...,  XL)  and  a=(a„  ...,  aP)  by  minimizing 
the  cost j unction 


X  Y  \y(k+  1 )-  £  XjFJ(y(k—j+ 1 ), «)  . 

k^L  I  I 

(9) 

This  alone  would  be  a  fairly  standard  least-squares 
way  of  determining  the  parameters  X  and  a.  Other 
than  the  requirement  that  F(y,  a)  search  phase  space 
neighborhoods  to  determine  where  to  map  y,  this 
minimization  is  familiar.  In  this  paper  we  report  re¬ 
sults  obtained  by  minimizing  C(X,  a)  by  searching  Vv‘ 
various  values  of  a,  for  fixed  values  of  X.  In  our  sub¬ 
sequent  paper  we  discuss  searches  over  both  X  and 
a. 

The  individual  orbits  of  dynamical  systems  of  the 
form  y-*F(y,  a)  which  exhibit  chaotic  motion  are 
sensitive  to  changes  in  initial  conditions  or  roundoff 
error  in  machine  calculations  [1  ].  There  are,  how¬ 
ever,  quantities  that  are  invariant  under  the  motion 
and  are  characteristic  of  the  dynamical  system  that 
gives  rise  to  the  data,  y(n).  The  least-squares  esti¬ 
mation  of  a  by  minimizing  C( X,  a)  does  not  guar¬ 
antee  that  the  resulting  map  F(y,  a)  will  give  the  cor¬ 
rect  invariants.  Even  if  C( X,  a)= 0  we  cannot 
guarantee  that  F(y,  a)  captures  the  full  dynamics  of 
the  system  that  generated  the  data,  since  all  data  are 
subject  to  error,  we  are  using  a  finite  data  set,  and 
only  predicting  a  few  stcDS  into  the  future.  In  any 
event,  in  practice,  we  are  able  to  make  C{X,  a)  small 
but  nonzero.  The  fact  that  C{X,  a)  is  not  zero  im¬ 
plies  that  there  is  certainly  no  guarantee  that  the  in¬ 
variants  will  be  reproduced  by  the  map  F(y,  a).  To 
guarantee  that  the  values  of  the  invariants  of  the  un¬ 
derlying  dynamical  system  are  built  into  the  para¬ 
meterized  maps  F{y,  a),  we  seek  the  parameters  val¬ 
ues  that  minimize  C(X,  a)  subject  to  the  constraints 
that  F(y,  a)  yield  the  correct  values  for  the  invariant 
characteristics.  To  implement  this  we  need  to  iden¬ 
tify  the  relevant  invariants,  determine  them  from  the 
data  set  in  a  manner  independent  of  the  least-squares 
minimization,  and  give  rules  on  finding  them  for  any 
map  F(y,  a). 

We  are  aware  of  two  kinds  of  invariants  for  dy¬ 
namical  systems  [1].  Both  are  connected  with  er- 
godic  properties  of  the  true  underlying  dynamics  that 
generated  the  data  set.  The  first  is  the  set  of  char¬ 
acteristic  Lyapunov  exponents,  A,,  A2,  ....  A^.  The 
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second  is  the  invariant  density  of  points  on  the  at¬ 
tractor,  p(y). 

We  assume  that  when  C(X,a)  is  near  zero  the  map 
F(y,  a)  is  ergodic  on  the  attractor.  If  this  is  true,  then 
the  A2’,p  are  given  by  Oseledec’s  multiplicative  er¬ 
godic  theorem  [12]  as  the  logarithms  of  the  eigen¬ 
values  of  i 


[  (DFu(y(  1 ) )  VDF“(y{  1 ) )'  ] t/2M 


(10) 


as  M-*oo,  where  the  dagger  denotes  Hermitean  con¬ 
jugate,  DF(y  j  is  the  Jacobian  matrix 


D  F(y)afi= 


and 


dFa(y,a) 

dy/> 


(11) 


D/'A/(y)  =  DF(FA/-,(y1 «) )..JDF(F(j>, a) )DF(y)  . 

(12) 

The  2’s  are  invariant  in  the  sense  that  all  initial  points 
y  ( 1 )  that  are  on  the  attractor  yield  the  same  values 
for  the  X's. 

If  only  27"p  is  desired,  then 


M 


log{Tr[DFv(y)  ]} 


(13) 


gives  a  very  accurate  value  for  as  M  becomes 
large.  Basically  this  is  because 

Tr[DFv(y)]  =  £  exp(3/2a)*exp(Af2,)  (14) 

aw  I 

for  large  M.  This  rule  for  finding  2|nip  from  the  map 
F(y,  a)  turns  out,  in  practice,  to  be  easy  to  program 
and  completely  adequate  for  use  in  constraining  the 
optimization  of  C(X, «). 

Establishing  the  Lyapunov  characteristic  expo¬ 
nents  from  data  turns  out  to  be  a  delicate  procedure 
for  all  but  the  largest  positive  exponent,  2?*u  [13]. 
This  appears  to  be  operationally  the  case  whether  the 
data  comes  from  an  experiment  or  is  computer  gen¬ 
erated.  Hence,  we  have  chosen  to  restrict  our  atten¬ 
tion  here  to  the  largest  positive  Lyapunov  exponent. 
Let  us  suppose  has  been  determined  from  the 
data,  y  (n ).  The  equality  XT**  =2?*u  forms  one  of  the 
constraints  we  will  impose  on  the  minimization  of 
the  cost  function. 

It  seems  to  us  a  matter  of  some  interest  to  create 
reliable,  efficient  methods  for  the  determination  of 


the  full  Lyapunov  spectrum  from  data  and  from  maps 
F(y,  *). 

We  now  turn  our  attention  to  the  second  type  of  • 
invariant,  the  invariant  density,  p(y).  The  invariant 
density  of  a  map,  F(y,  a),  is  defined  as  t  - 


p(y)mn>=  lim  ~  Z  fD(y-Fk(y( !),«)) .  (15>.| 

u-oo  M k- 1 


For  an  ergodic  map  (which  we  have  assumed  our  1 
map  to  be)  p(y)  is  invariant  in  the  sense  that  all  ini-  | 
tial  points  y(  1 )  that  are  on  the  attractor  yield  the  ■ 
same  value  for  p(y).  The  spatial  average  of  g(y),  air* 
arbitrary  function  on  phase  space,  is  given  by 

<^>  =  JdDy/)0-)g(y) . 

For  ergodic  maps  < g >  is  invariant  under  the  action 
of  F(y,  a),  i.e. 


|  dDyp{y)mmpg(y)  =  <£> 

map 

=  |  dDyp(y)mipg{F(y,a)) . 


(16) 


The  invariant  density  determined  by  the  data  is 
clearly  given  as 

P{y) d»u=  t;  I  SD(y-y(k)) .  (17) 

Thus,  <g><uu  is  given  by 

£[y(fc)]  .  (18): 

The  second  constraint  we  will  impose  on  C(X, «) 
is  the  equality  of  p(y )mtp  with  p{y)( Uu.  We  cannot 
constrain  the  minimization  of  C{X,  a)  by  the  full 
content  of  p(y),  since  it  contains  an  infinite  amount 
of  local  information.  However,  we  can  choose  some 
specific  functions  g,(y)  which  we  deem  important 
about  the  dynamical  system  and  require  <&>,»*, 
<£/>d.u  as  constraints.  For  this  paper  we  report  the) 
results  obtained  by  using  grr|«]4exp{-  il”l' 
There  is  no  intrinsic  significance  to  this  phase  funo 
tion,  but  it  does  have  contributions  from  large  values 
of  Euclidean  distances  on  the  attractor.  Also  it  is  not 
connected  in  any  direct  way  with  the  function  C(X 
a)  and,  thus,  contains  information  about  the  attrao- 
tor  quite  different  from  C(X,  a).  '-'H 

An  obvious  question  is  what  is  the  best  choice  of 
moments,  <g> ,  to  use  to  constrain  the  cost  function. 
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To  answer  this  we  expand  p(y)  in  terms  of  some  set 
of  orthonormal  functions  Y„{y)  which  are  concen¬ 
trated  on  the  attractor, p(y)  «=  i-S^fy).  The  re¬ 
quirement  that  the  y^(y)’s  be  concentrated  on  the 
attractor  implies  that  the  number  of  these  functions 
needed  to  represent  p(y)  accurately  (within  the  res¬ 
olution  given  by  finite  W)  can  be  small.  Thus  the  de¬ 
tails  of  p(y)  can  be  transmitted  in  terms  of  a  set  of 
functions  “tuned"  to  the  attractor  and  their  coeffi¬ 
cients.  One  learns  the  functions  from  one  data  set 
and  uses  them  as  a  basis  set  for  future  data  sets.  If 
we  knew  the  B we  would  be  able  to  reconstruct  the 
phase  space  average  of  any  g(y)  since 

<g>  =  dDyi/fM(jy)  giy)^j  •  (19) 

The  term  in  large  parentheses  is  independent  of  the 
dynamical  system.  It  depends  on  the  phase  function 
g(y)  and  the  basis  vectors  only. 

The  Bm’$  are  the  moments  (phase  space  averages) 
of  the  eigenfunctions  y^(y)  and  constitute  the  op¬ 
timal  moments  to  use  in  constraining  the  cost  func¬ 
tion.  In  the  larger  paper  which  follows  this  note  [7] 
we  show  how  the  B„=f  d Dyp(y)  y^(y)  may  be  used 
to  constrain  the  minimization  of  C(X,  a).  The  key 
is  the  choice  of  the  which  we  select  as  the  op¬ 

timum  Karhunen-Loive  eigenfunctions  [14]  of  the 
correlation  matrix  formed  by  independent  samples 
of  A^ty).  These  eigenfunctions  are  automatically 
concentrated  on  the  attractor. 

We  have  implemented  the  program  outlined  here 
using  “data"  generated  by  the  H6non  map  of  the 
plane  (*,,  x2)  to  itself, 

*i(rt+l)  =  1.0-ax, («)J+x2(n) , 
x2(/i+l)=ix,(/t) ,  (20) 

with  the  familiar  parameter  values  a=  1.4  and  b— 0.3. 
Starting  with  data  on  x,(o),  n—i,  2,  ...,  N+l  we 
constructed  the  two  vectors  y(n)  =  (x,(n), 

x,  (n+ 1 ) )  for  n=  1, 2 N as  our  basic  data.  In  units 

set  by  the  map  itself,  we  found  the  minimum  dis¬ 
tance  between  points  on  the  attractor  to  be  a  10_< 
to  10“5  when  N^500.  We  chose  the  parameter  o  in 
our  map  F(y,  a)  to  be  about  100  times  the  square  of 
this  minimum.  For  large  N  this  means  that  most 
points  in  the  data  set  will  have  some  neighbors.  Spe¬ 
cifically  we  took  tr=  3.4  x  10" 7. 


For  our  predictor  we  took  three  terms; 
y{l+ 1  )=XlF{y{l),  *)+X2P{y{l- 1 ),  a) 

+X>P(y(I-2),M).  (21) 

Recall  that  we  chose  to  fix  the  X,  rather  than  vary 
them  in  this  note.  To  reflect  our  greater  confidence 
in  the  lower-order  iterates  of  F(y, «),  we  took  A'j  =0.8 
and  X2 = A'j = 0. 1 .  Similar  values  of  the  Xi  give  much 
the  same  qualitative  results.  In  the  work  we  report 
here  we  have  chosen  the  number  of  parameters  a  to 
be  4.  We  constrained  the  minimization  of  C(X,  a) 
over  the  a  with  >1,  and  with  the  mean  of  the  phase 
function  <g>  so  when  both  constraints  are  imposed 
we  have  two  free  variables  among  the  a.  The  powers 
m}  and  m4  in  the  F(y,  a)  were  chosen  to  be  fixed 
during  the  minimization  of  C(X,  a).  We  took  them 
to  be  m j = 4  and  m4 = 5.  A  parameter  search  over  the 
mk  could  have  been  done  as  well;  but  we  have  not 
done  this. 

,There  is  an  important  item  which  we  were  re¬ 
quired  to  address  in  our  work.  It  is  quite  general,  so 
we  discuss  it  and  our  solution  to  it  before  reporting 
on  particular  calculations.  The  data  we  are  given  lies 
on  the  attractor,  indeed,  it  defines  the  attractor.  The 
attractor  is  an  object  of  zero  volume  in  R°  since  its 
dimension  is  less  than  D.  We  have,  thus,  no  infor¬ 
mation  from  the  data  on  the  behavior  of  the  map 
y-»F(y,  *)  when  y  lies  in  most  of  R°.  Our  map  must 
contain  some  rule  which  takes  points  y  that  are  off 
the  attractor  and  brings  them  onto  the  attracting  set. 
Our  class  of  maps  F(y,  a)  does  that  by  mapping 
points  that  are  off  the  attractor,  which  clearly  have 
no  neighbors  among  the  data y(n),  to  the  originy=0. 
We  have  addressed  this  matter  by  translating  the  or¬ 
igin  of  the  coordinate  system  in  which  the  y(n)  are 
given  to  lie  well  within  *Ja  of  some  data  point 
(Which  data  point  we  chose  seemed  not  to  matter. ) 
When  the  parameters  a  have  reached  values  near  the 
optimum  and  that  optimum  is  doing  a  good  job  of 
tracking  the  data,  this  translation  of  the  origin  is 
doing  nothing.  While  we  are  searching  the  a,  how¬ 
ever,  and  are  far  from  the  optimum,  this  translation 
reinjects  points  mapped  off  the  attractor  by  a  bad 
F(y,  a)  back  onto  or  very  near  the  attractor.  This  de¬ 
vice,  or  an  equivalent  one,  provides  both  stability  and 
logic  to  the  parameter  search  and  is  certainly  needed 
for  any  Z>>2. 

If  there  were  a  way  to  probe  the  system  producing 
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the  data,  we  could  avoid  this  need  for  reinjection  by 
pulsing  the  probe  and  letting  the  system  itself  ex¬ 
plore  phase  space  off  the  attractor.  The  information 
we  need  would  then  be  in  the  data  set  The  usual  sit¬ 
uation  we  anticipate  is  that  the  scalar  time  series  x(n ) 
represents  long-time  behavior  of  the  system  and  that 
only  motion  on  the  attractor  is  represented  in  the 
data.  • 

In  our  work  with  data  from  the  Hinon  map  we  be¬ 
gan  with  jV=  750.  This  was  large  enough  to  cover  the 
attractor  reasonably  densely  and  allowed  us  to  make 
calculations  quickly.  We  also  tried  to  choose  N  large 
enough  to  give  an  accurate  picture  of  the  attractor 
but  small  enough  that  we  could  imagine  ourselves 
operating  on  a  small  data  set  we  had  been  given.  Our 
choice  of  N  was  partly  motivated  by  our  observa¬ 
tions  that  the  method  we  use  to  compute  X ,  when 
applied  to  the  H6non  map  directly  undergoes  some 
fluctuations  for  N  much  smaller  than  400  and  has 
settled  down  to  a  value  X  t =0.408...  after  that.  This 
value  is  consistent  with  other  determinations  of  this 
Lyapunov  exponent,  so  we.  were  confident  we  had 
not  chosen  too  small  a  value  of  N.  With  this  value 
of  N,  we  searched  the  parameters  a  to  minimize  C(X, 
a).  Our  search  utilized  the  software  package  NPSOL 
[15],  which  docs  not  search  for  global  minima,  so 
some  variation  of  initial  values  of  the  «  was  needed 
on  our  part.  After  some  looking  around  we  found  a 
very  shallow  minimum  in  the  variable  a2  near  which 
the  other  a,  took  the  same  values  for  large  excursions 
in  a2.  The  values  of  the  parameters  at  the  minimum 
were  a,  =  1.18  a2= 607,  a3=— 0.0784,  and 
04=0.0126.  At  these  values,  the  cost  function  C(X, 
a)  took  the  value  0.03967,  while  the  Lyapunov  ex¬ 
ponent  X  t  was  5.95,  rather  than  0.408,  and  the  value 
of  the  phase  space  average  of  g(y)  was  6.18  rather 
than  the  value  of  2.81  computed  from  the  data. 
Clearly  we  achieved  a  very  good  “fit”  to  the  data  as 
far  as  the  cost  function  C(X,  a)  was  concerned,  but 
the  map  F(y,  a)  at  that  set  of  parameters  a  had  little 
to  do  with  the  dynamical  system  generating  the  data. 

Next  we  imposed  only  the  Lyapunov  constraint 
(Am*p=2d*u)  on  the  minimization  of  the  cost  func¬ 
tion.  The  parameter  values  found  in  our  search  were 
<2i  =  1.19,  a2=0.816,  a3=0.000764,  and 

a4=  -0.01 21.  At  this  point  the  value  of  C(X,  a)  was 
0.04175,  which  is  still  an  excellent  “fit"  in  a  least- 
squares  sense.  The  Lyapunov  exponent  now  differed 


from  its  value  taken  from  the  data  by  0.05!  This  ac¬ 
tually  was  a  general  rule  we  saw  in  our  fitting  of  F(y , 
«);  namely,  the  smallest  C(X, «)  did  not  give  good 
values  for  the  constraints,  and  a  larger,  though  often 
not  much  larger,  cost  function  was  found  by  the  con¬ 
strained  optimum.  This  means  that  while  the  con¬ 
strained  optimum  will  give  a  slightly  worse  point  to 
point  prediction  of  future  values  of  points  in  it 
will,  by  construction,  give  better  global  properties. 

Finally  we  imposed  the  equality  of  both  the  largest 
Lyapunov  exponent  (Anup=2d*“ )  and  the  phase  space^ 
average  (<s>m.P=<£>d..«).  The  precise  minimum” 
resulting  from  the  application  of  the  search  routine 
NPSOL  depends  on  the  acceptable  limits  we  put  on 
the  satisfaction  of  the  constraints.  All  minima  were 
in  the  neighborhood  of  C(X,  a) =0.09.  For  example, 
requiring  each  of  the  constraints  to  be  met  to  an  ac¬ 
curacy  of  0.005,  led  to  a  cost  function  of  0.09082. 
Relaxing  this  to  an  accuracy  of  0.01,  led  to  a  cost 
function  of  0.09063,  which  is  essentially  the  same. 

’  The  parameter  values  shifted  around  in  a  common 
neighborhood  for  all  these  limits  imposed  on  the 
constraints,  indicating  we  we  re  just  moving  around 
the  same  constrained  minimum  at  various  small  dis¬ 
tances.  For  constraints  required  to  be  satisfied  within 
±0.006,  the  values  of  the  a,  were  a, = 0.940,  a2=  1.21, 


a3=  —  0.0260,  and  04=0.00188. 

What  is  important  here  is  not  the  specific  set  of 
values  of  the  parameters  «.  Rather,  it  is  that  we  are 
able  to  find  a,’s  that  meet  both  of  our  constraints  and 
that  although  the  o/s  that  give  accurate  values  of  the 
constraints  are  quite  different  from  those  of  the  un¬ 
constrained  minimum,  they  still  give  an  acceptably 
small  cost  function. 

We  repeated  this  kind  of  calculation  with  a  variety 
of  values  of  N  and  on  data  sets  generated  with  dif¬ 
ferent  initial  conditions  for  the  underlying  H6non 
map.  For  A=  1200  and  A=  1700,  for  example,  we 
report  in  table  1  the  results  of  calculations  precisely 
along  the  lines  just  discussed.  They  are  rather  similar 
in  character  to  the  results  for  W=750  just  reported 
and  to  the  results  for  other  values  of  N  we  explored/ 
In  the  table  we  show  the  C(X,  a)  for  both  the  con¬ 
strained  and  unconstrained  optimization.  The  val¬ 
ues  of  the  constraints  2 1  and  <g>  are  shown  for  the 
map  both  when  those  constraints  have  been  imposed 
and  when  they  have  not. 
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Table  1  • 

Optimization  results.  C(X,  t)  (e<j.  (9)  with  F(y,  «)  from  eq.  (4))  ia  *hown  with  and  without  invariant  constraints.  The  following 
parameter!  are  used  for  all  X,  m  0. 8,  X} »«  0. 1 ,  X> »  0. 1 ,  if***  =0.408. 


V 

C(Y.a) 

a, 

aj 

A, 

2rp 

<*>"“* 

750 

2.81 

unconst. 

0.03502 

1.18 

607 

-0.0784 

0.0126 

5.95 

6.18 

At 

0,04175 

1.19 

0.816 

0.000764 

-0.0121 

0.403 

6.16 

<e> 

0.09105 

0.940 

1.21 

-0.0260 

0.00188 

0.406 

2.82 

1200 

2.81 

unconst. 

0.03090 

1.19 

14.30 

-0.0650 

0.00912 

2.72 

6.24 

X, 

0.03113 

1.17 

0.677 

-0.0645 

0.00961 

0.408 

5.96 

* 

Xu  <g> 

0.0851 

0.966 

-0.718 

-0.00757 

-0.00649 

0.411 

2.81 

1700 

2.90 

unconn. 

0.03444 

1.16 

-13.36 

-0.0809 

0.00150 

2.50 

6.48 

X, 

0.03894 

1.17 

-0.246 

-0.126 

0.00301 

0.405 

6.32 

Xu  <g> 

0.0926 

0.939 

0.478 

-0.0355 

-0.00237 

0.409 

2.91 

We  have  now  demonstrated  that  using  a  predictor 
of  the  form 

y(N+l)=ixjFJ(y(N-j+l,a) 

Jm  I 

with  our  class  of  mapping  functions  F(y,  a)  can  give 
excellent  least-squares  fits  to  chaotic  time  series  data 
on  a  strange  attractor  while  simultaneously  satisfy¬ 
ing  constraints  on  those  fits  dictated  by  the  geomet¬ 
rical  invariants  which  characterize  that  attractor. 
Furthermore,  a  straightforward  least-squares  fit  to 
the  data  does  not,  in  general,  reproduce  the  dynam¬ 
ical  information  on  Lyapunov  exponents  and  invar¬ 
iant  densities  that  are  contained  in  the  data  itself.  To 
produce  the  correct  value  of  the  invariants  we  must 
accept  a  larger  cost  function.  This  loss  in  least-squares 
based  predicted  power  is  made  up  for  by  the  built  in 
quality  of  our  predictors;  namely,  they  will  produce 
the  correct  long-term  statistical  behavior  of  the  dy¬ 
namical  system  whose  properties  we  are  trying  to 
learn  by  the  analysis  of  the  original  scalar  time  series. 

Several  directions  are  clear  for  further  investiga¬ 
tion.  One  is  to  apply  this  set  of  methods  to  higher- 
dimensional  systems  both  for  computer  generated 
data  and  for  experimental  data.  Before  the  latter  is 
addressed  we  must  study  in  detail  the  important  is¬ 
sue  of  how  to  treat  extrinsic  noise  within  the  kind  of 
phase  space  description  of  time  series  we  are  using 
here.  Some  ideas  on  that  are  contained  in  other  work 
[  9, 1 6  ] ,  though  no  consideration  of  it  has  been  given 
here.  Another  quite  interesting  issue  is  the  devel¬ 
opment  of  a  set  of  reliable  and  efficient  algorithms 


for  the  extraction  of  invariant  quantities  such  as 
Lyapunov  exponents  from  data.  Finally  the  exten¬ 
sion  of  the  methods  demonstrated  here  to  systems 
with  spatial  degrees  of  freedom  would  be  most  in¬ 
teresting.  We  plan  to  discuss  many  of  these  items  in 
our  own  expansion  of  this  short  note  {7], 
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We  consider  the  problem  of  prediction  ted  system  identification  for  time  series  having  broadband 
power  spectra  that  arise  from  the  intrinsic  nonlinear  dynamics  of  the  system.  We  view  the  motion 
of  the  system  in  a  reconstructed  phase  space  that  captures  the  attractor  (usually  strange)  on  which 
the  system  evolves  and  give  a  procedure  for  constructing  parametrized  maps  that  evolve  points  in 
the  phase  space  into  the  future.  The  predictor  of  future  points  in  the  phase  space  is  a  combination 
of  operation  on  past  points  by  the  map  and  its  iterates.  Thus  the  map  is  regarded  as  a  dynamical 
system  and  not  just  a. fit  to  the  data.  The  invariants  of  the  dynamical  system,  the  Lyapunov  ex¬ 
ponents  and  optimum  moments  of  the  invariant  density  on  the  attractor,  are  used  as  constraints  on 
the  choice  of  mapping  parameters.  The  parameter  values  are  chosen  through  a  constrained  least- 
squares  optimization  procedure,  constrained  by  the  values  of  these  invariants.  We  give  a  detailed 
discussion  of  methods  to  extract  the  Lyapunov  exponents  and  optimum  moments  from  data  and 
show  how  to  equate  them  to  the  values  for  the  parametric  map  in  the  constrained  optimization.  We 
also  discuss  the  motivation  and  methods  we  utilize  for  choosing  the  form  of  our  parametric  maps. 
Their  form  has  a  strong  similarity  to  the  work  in  statistics  on  kernel  density  estimation,  but  the 
goals  and  techniques  differ  in  detail.  Our  methodology  is  applied  to  "data”  from  the  Henon  map 
and  the  Lorenz  system  of  differential  equations  and  shown  to  be  feasible.  We  find  that  the  parame¬ 
ter  values  that  minimize  the  least-squares  criterion  do  not,  in  general,  reproduce  the  invariants  of 
the  dynamical  system.  The  maps  that  do  reproduce  the  values  of  the  invariants  are  not  optimum  in 
the  least-squares  sense,  yet  still  are  excellent  predictor^.  We  discuss  several  technical  and  general 
problems  associated  with  prediction  and  system  identification  on  strange  attractors.  In  particular, 
we  consider  the  matter  of  the  evolution  of  points  that  are  off  the  attractor  (where  few  or  no  data  are 
available),  onto  the  attractor  where  long-term  motion  takes  place.  We  find  that  we  are  able  to  real¬ 
ize  maps  that  give  a  least-souares  approximation  to  the  data  with  rms  variation  over  the  attractor  of 
0.5%  or  less  and  still  reproduce  the  dynamical  invariants  to  5%  or  better.  The  dynamical  invari¬ 
ants  are  the  classifiers  of  the  dynamical  system  producing  the  broadband  time  series  in  the  first 
place,  so  this  quality  of  the  maps  is  essentia]  in  representing  the  correct  dynamics. 


I.  INTRODUCTION 


A  General  remarks 


Analysis  of  time  series  from  dynamical  systems  is  an 
important  issue  in  many  different  fields  of  engineering 
and  science.  The  most  common  tool  for  this  analysis  is 
the  Fourier  (or  other  similar)  transform  of  the  data  x(n) 
to  discover  sharp  lines  in  its  power  spectrum.  Spectral 
identification  lies  at  the  heart  of  much  of  the  work  on 
linear  systems  to  which  time  series  analysis  is  applied.1,2 
When  one  encounters  a  broadband  power  spectrum,  the 
common  assumption  is  that  it  represents  extrinsic  noise 
and  not  characteristics  of  the  signal. 

It  has  become  increasingly  clear  in  recent  years  that 
nonlinear  systems  exhibiting  deterministic  chaos  will  gen¬ 
erate  a  time  series  whose  power  spectrum  is  broadband. 
Gencrically,  dissipative  nonlinear  chaotic  systems  evolve 


nonncriodically  on  a  strange  attractor  that  lives  in  m 
phase  space  of  finite  (and  often  small)  dimension.  Noix 
does  not  evolve  on  a  strange  attractor  and  will  occupy  a* 
arbitrarily  large  number  of  dimensions.  Hence  to  modd 
nonlinear  chaotic  systems  as  noise  is  certainly  incorrect. 
For  these  systems  the  source  cf  the  broadband  spectrin* 
is  the  intrinsic  chaotic  dynamics  that  underlies  the  time 
series. 

Our  focus  in  this  work  is  on  signals  w»-h  a  substantial 
broadband  power  spectrum  which,  since  external  noise  is 
absent  or  very  small,  represents  the  nonperiodic  behavior 
of  a  dynamical  system  whose  orbits  lie  on  »  strange  at¬ 
tractor.  The  idea,  now  rather  well  established,  that  such 
an  object  can  have  a  small  fractal  dimension  (and  still 
govern  the  long  time  evolution  of  a  system  with  far  more 
numerous  degrees  of  freedom  than  represented  by  the  di¬ 
mension  of  the  attractor)  is  really  the  starting  point  of 
our  work.3,4 
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PREDICTION  IN  CHAOTIC  NONLINEAR  SYSTEMS:  . . , 


It  is  very  important  that  though  x(n )  may  be  a  long, 
quiet  data  set  it  is  likely  to  have  a  very  broad  power  spec¬ 
trum.  Indeed,  if  the  signal  one  is  studying  has  a  power 
spectrum  with  substantial  strong  lines,  one  is  well  advised 
to  recognize  the  implied  sinusoids  as  the  underlying 
linear  degrees  of  freedom  and  avoid  altogether  the  labor 
we  propose  here. 

It  has  been  shown  that  in  nonlinear  systems  that  exhib¬ 
it  deterministic  chaos  one  can  determine  from  the  obser¬ 
vation  of  a  single  dynamical  variable  the  geometric  struc¬ 
ture  of  the  many  variable  dynamics  that  produced  the 
measured  signal.5-10  The  method  that  has  developed  for 
the  construction  of  the  phase  space  in  which  the  dynam¬ 
ics  dwells  is  called  phase-space  reconstruction.  The  result 
of  this  reconstruction  is  an  embedding  space  of  d  dimen¬ 
sions  (d  is  an  integer)  in  which  one  may  observe  the  at¬ 
tractor.  One  can  view  the  evolution  in  the  reconstructed 
phase  space  of  the  many  dimensional  dynamics  in  a  quan¬ 
titative  fashion  in  the  time  domain. 

In  this  article  we  describe  both  in  outline  and  im¬ 
plementation  a  program  for  extracting  from  the  observa¬ 
tions  of  this  single  broadband  temporal  signal  quantita¬ 
tive  predictions  for  the  evolution  of  initial  conditions 
differing  from  the  observed  data  points.  We  assume  that 
once  transients  are  gone  the  evolution  of  the  system  is  on 
a  strange  attractor  with  dimension  d A ,  where  dA  is  gen¬ 
erally  fractional.  If  the  evolution  of  the  system  is  on  such 
an  attractor,  then  the  d-dimensional  embedding  space  en¬ 
closing  the  attractor  should  be  sufficiently  larger  than  dA 
that  all  the  geometric  information  about  the  attractor  is 
exposed  in  the  embedding  space.  Mafie  and  Takens’s6,7 
formal  result  requires  d  >  ldA  + 1  to  assure  one  of  a  faith¬ 
ful  representation  of  the  dA  -dimensional  attractor  as  seen 
in  the  d-dimensional  embedding  space,  but  often,  in  prac¬ 
tice,  d>dA  will  do.  The  method  of  phase-space  recon¬ 
struction  seeks  to  construct  from  the  x(n)’s  d- 
dimensional  vectors  which,  when  embedded  in  Rd  de¬ 
scribes  the  full  dynamical  evolution  of  the  system.  Sec¬ 
tion  II  is  devoted  to  the  issue  of  identifying  the  correct 
value  of  d  from  the  data  set. 

For  the  moment  suppose  we  have  found  d  by  one 
means  or  another.  We  imagine  measuring  a  single  scalar 
variable  x  at  discrete  time  points  x(n)  for 
n  =  1,2, . . . ,  ND.  (Observation  of  several  dynamical  vari¬ 
ables  from  the  system  is  even  better,  and  serves  to  pro¬ 
vide  confirmation  of  the  information  on  the  deductions 
from  observations  of  any  single  variable.)  We  can  con¬ 
struct  d-dimensional  vectors  y(n )  in  the  embedding  space 
by 

y(n)=(*(n),x(n+T,),Jc(«+T2), . . .  ,x(n+Trf_,))  , 

for  some  set  of  time  lags  T|,t2,  . . .  ,rrf_|.  The  set  of 
yin  )’s,  of  which  we  have  N=ND—d,  capture  the  evolu¬ 
tion  of  the  nonlinear  system  under  observation  as  it 
moves  through  the  d-dimensional  phase  space.  Familiar 
phase-space  coordinates  are  the  time  derivatives 
x(n),dx{n)/dt,d2x{n)/dt2, . . . ,  evaluated  at  discrete 
times.  The  data  on  x(n)  are  acquired  only  at  discrete 
times  and  establishing  the  values  of  these  derivatives  is 
certain  to  be  inaccurate.  The  time  lagged  x(n  )’s,  which 


are  the  coordinate  elements  of  the  y(n  )’s,  are  nonlinear 
combinations  of  the  local  time  derivatives  and  are  fully 
acceptable  substitutes  for  the  usual  phase-space  coordi¬ 
nates.  This  has  been  emphasized  by  Eckmann  and 
Ruelle.10 

With  the  y in  )’s  and  the  embedding  space  in  hand,  we 
aik  here  the  ambitious  question  of  how  we  can  use  the 
aeries  of  y(n)’s  to  predict  yiN+l),yiN+2),  etc. 
Equivalently,  we  can  ask  what  is  the  evolution,  under  the 
same  dynamical  system  that  produced  the  y(n)’s,-of  a 
point  y,  that  is  on  the  attractor  but  not  in  the  original 
data  set.  We  will  have  answered  this  question  when  given 
a  data  set  y(  1  ),y(2), . . .  ,y(N),  we  have  identified  a  “reli¬ 
able”  map  F  from  Rd  to  itself  parametrized  by 
a— (a,,a2, . . .  ,Op)  which  takes  us  from  y(n)  to  yin  +  1), 

y(n  +  l)=F(y(rt),a)  . 

If  we  can  establish  a  reliable  F(y,a),  then  the  evolution 
of  a  point  y  in  that  is  not  a  member  of  the  measur¬ 
ed  data  set  would  be  y-»yj=F(y,a),  yt— *-y2:=F(y1,a) 
=F(F(y,a),a)=F2(y,a),  etc. 

Our  first  view  of  the  data  y(l),y(2), . . .  ,yiN)  is  that  it 
can  be  thought  of  as  a  pair  of  columns  of  vectors  in  Rd 

y(l)  y(2) 

y(2)  y(3) 

y(n)  y(n  +  l) 

yiN-1)  y IN), 

and  our  function  F(y,a)  comes  from  parametrically 
"fitting”  the  right-hand  column  of  y(n  +  l)  resulting 
from  the  left-hand  column  of  yin ).  Fitting  the  data  then 
suggests  making  a  least-squares  estimation  of  a  so  that 
the  cost  function 

Ci a)=2'  2  lyJn  +  D-FJyin),*)]2 

n  ■=  I  m  «*  I 

is  minimized.  Our  approach  differs  from  previous  work 
in  detailed  tactics  and  in  our  imposition  of  important 
geometrical  structure  as  constraints  on  the  minimization 
of  the  cost  function.  The  articles  we  have  greatly  relied 
on  for  guidance  and  initial  impetus  in  our  research  are 
those  by  Farmer  and  Sidorovitch11  (we  refer  to  this  paper 
as  FS  in  the  following),  Lapedes  and  Farber,  and 
Crutchfield  and  McNamara.13 

Our  main  point,  simply  stated,  is  that  we  are  not  just 
making  a  fit  to  data  with  a  set  of  functions  F(y,a).  Rath¬ 
er,  these  functions  evaluated  along  the  orbit  are  to  be  re¬ 
lated  to  each  other  in  the  manner  of  a  dynamical  system. 
This  leads  to  a  rather  different  view  of  the  fitting  func¬ 
tions  than  the  one  usually  taken  in  trying  to  match  data 
to  observations.  It  means  that  the  function  F(y,a)  evalu- 
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It  is  very  important  that  though  x(n )  may  be  a  long, 
quiet  data  set  it  is  likely  to  have  a  very  broad  power  spec* 
trum.  Indeed,  if  the  signal  one  is  studying  has  a  power 
spectrum  with  substantial  strong  lines,  one  is  well  advised 
to  recognize  the  implied  sinusoids  as  .the  underlying 
linear  degrees  of  freedom  and  avoid  altogether  the  labor 
we  propose  here. 

It  has  been  shown  that  in  nonlinear  systems  that  exhib¬ 
it  deterministic  chaos  one  can  determine  from  the  obser¬ 
vation  of  a  single  dynamical  variable  the  geometric  struc¬ 
ture  of  the  many  variable  dynamics  that  produced  the 
measured  signal.5-10  The  method  that  has  developed  for 
the  construction  of  the  phase  space  in  which  the  dynam¬ 
ics  dwells  is  called  phase-space  reconstruction.  The  result 
of  this  reconstruction  is  an  embedding  space  of  d  dimen¬ 
sions  (d  is  an  integer)  in  which  one  may  observe  the  at¬ 
tractor.  One  can  view  the  evolution  in  the  reconstructed 
phase  space  of  the  many  dimensional  dynamics  in  a  quan¬ 
titative  fashion  in  the  time  domain. 

In  this  article  we  describe  both  in  outline  and  im¬ 
plementation  a  program  for  extracting  from  the  observa¬ 
tions  of  this  single  broadband  temporal  signal  quantita¬ 
tive  predictions  for  the  evolution  of  initial  conditions 
differing  from  the  observed  data  points.  We  assume  that 
once  transients  are  gone  the  evolution  of  the  system  is  on 
a  strange  attractor  with  dimension  d A ,  where  dA  is  gen¬ 
erally  fractional.  If  the  evolution  of  the  system  is  on  such 
an  attractor,  then  the  d-dimensional  embedding  space  en¬ 
closing  the  attractor  should  be  sufficiently  larger  than  d A 
that  all  the  geometric  information  about  the  attractor  is 
exposed  in  the  embedding  space.  Matte  and  Takens’s6,7 
formal  result  requires  d  >  2dA  + 1  to  assure  one  of  a  faith¬ 
ful  representation  of  the  dA  -dimensional  attractor  as  seen 
in  the  d-dimensional  embedding  space,  but  often,  in  prac¬ 
tice,  d>dA  will  do.  The  method  of  phase-space  recon¬ 
struction  seeks  to  construct  from  the  x[n)'s  d- 
dimensional  vectors  which,  when  embedded  in  Rd  de¬ 
scribes  the  full  dynamical  evolution  of  the  system.  Sec¬ 
tion  II  is  devoted  to  the  issue  of  identifying  the  correct 
value  of  d  from  the  data  set. 

For  the  moment  suppose  we  have  found  d  by  one 
means  or  another.  We  imagine  measuring  a  single  scalar 
variable  x  at  discrete  time  points  x(n )  for 
n  — 1,2, . . .  ,Nd.  (Observation  of  several  dynamical  vari¬ 
ables  from  the  system  is  even  better,  and  serves  to  pro¬ 
vide  confirmation  of  the  information  on  the  deductions 
from  observations  of  any  single  variable.)  We  can  con¬ 
struct  d-dimensional  vectors  y(n )  in  the  embedding  space 
by 


v (n)=(x(n  ),x(n  +t,),jc(ij  +r2), . . .  ,x[r.  +-«,_,}> , 


for  some  set  of  time  lags  tut2,  . . .  ,rrf_j.  The  set  of 
y(n  )’s,  of  which  we  have  N—ND—d,  capture  the  evolu¬ 
tion  of  the  nonlinear  system  under  observation  as  it 
moves  through  the  ddimensional  phase  space.  Familiar 
phase-space  coordinates  are  the  time  derivatives 
x(n  ),dx(n  )/dt,d2x{n  )/dt2, . . . ,  evaluated  at  discrete 
times.  The  data  on  x(n)  are  acquired  only  at  discrete 
times  and  establishing  the  values  of  these  derivatives  is 
certain  to  be  inaccurate.  The  time  lagged  x(n)’s,  which 


are  the  coordinate  elements  of  the  y(n  )’s,  are  nonlinear 
combinations  of  the  local  time  derivatives  and  are  fully 
acceptable  substitutes  for  the  usual  phase-space  coordi¬ 
nates.  This  has  been  emphasized  by  Eckmann  and 
Ruelle.10 

With  the  y(n  )’s  and  the  embedding  space  in  hand,  we 
ask  here  the  ambitious  question  of  how  we  can  use  the 
aeries  of  y(n)’s  to  predict  y(.W+l),y(Af+2),  etc. 
Equivalently,  we  can  ask  what  is  the  evolution,  under  the 
same  dynamical  system  that  produced  the  y(n)’s,  ~of  a 
point  y,  that  is  ort  the  attractor  but  not  in  the  original 
data  set.  We  will  have  answered  this  question  when  given 
a  data  set  y(  1  ),y(2), . . . ,  y(N),  we  have  identified  a  “reli¬ 
able”  map  F  from  Rd  to  itself  parametrized  by 
a=(o,,a2, . . .  ,aP)  which  takes  us  from  y(n )  to  y(n  +  1), 

y(n  +  l)=F(y(n),a)  . 

If  we  can  establish  a  reliable  F(y,a),  then  the  evolution 
of  a  point  y  in  Rrf  that  is  not  a  member  of  the  measur- 
ed  data  set  would  be  y-*-y1=:F(y,a),  y1-*-y2=F(y1,a) 
=  F(F(y,a),a)=F2(y,a),  etc. 

Our  first  view  of  the  data  y(  1  ),y(2), ....  y(fV)  is  that  it 
can  be  thought  of  as  a  pair  of  columns  of  vectors  in  Rd 


y(l) 

y(2) 

y(2) 

y(3) 

y(n) 

y(n  +  l 

yUV-1) 

y(AO, 

and  our  function  F(y,a)  comes  from  parametrically 
“fitting”  the  right-hand  column  of  y(n  +  l)  resulting 
from  the  left-hand  column  of  y(n ).  Fitting  the  data  then 
suggests  making  a  least-squares  estimation  of  a  so  that 
the  cost  function 


N-\ 

£(a)=  2 

n  *=*  1 


2  bm(«  +  l)-Fm(y(n),a)]7 

m 


is  minimized.  Our  approach  differs  from  previous  work 
in  detailed  tactics  and  in  our  imposition  of  important 
geometrical  structure  as  constraints  on  the  minimization 
of  the  cost  function.  The  articles  we  have  greatly  relied 
on  for  guidance  and  initial  impetus  in  our  research  are 
those  by  Farmer  and  Sidorovitch11  (we  refer  to  this  paper 
as  FS  in  the  following),  Lapedes  and  Father,  and 
Crutchfield  and  McNamara.13 

Our  main  point,  simply  stated,  is  that  we  are  not  just 
making  a  fit  to  data  with  a  set  of  functions  F(y,a).  Rath¬ 
er,  these  functions  evaluated  along  the  orbit  are  to  be  re¬ 
lated  to  each  other  in  the  manner  of  a  dynamical  system. 
This  leads  to  a  rather  different  view  of  the  fitting  func¬ 
tions  than  the  one  usually  taken  in  trying  to  match  data 
to  observations.  It  means  that  the  function  F(y,a)  evalu- 
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a  ted  on  the  data  vector  j(n )  i*  required  to  do  more  than 
reproduce  y(n  +  l)  as  accurately  as  possible.  F(y,a) 
must  also  be  a  function  which  when  iterated  will  repro¬ 
duce  yin  +2)  after  two  applications  to  y in )  and  y(n  +3) 
after  three,  etc.  The  notion  of  F(y,a)  as  a  dynamical  sys¬ 
tem  also  leads  to  modifications  of  the  cost  function.  The 
cost  function  should  reflect  the  fact  that  iterations  of 
F{y,a)  also  yield  points  on  the  orbit.  Furthermore,  under 
our  approach  geometrical  properties  of  the  dynamical 
system  given  by  F(y,a)  are  used  to  determine  the  success 
of  the  fit.  It  is  not  just  the  function’s  ability  to  reproduce 
in  a  least-squares  sense  the  observed  data  that  is  impor¬ 
tant.  The  data  contain  invariant  information  that  is 
essential  for  a  full  description  of  the  geometrical  struc¬ 
ture  of  the  attractor  that  it  evolves  on.  Our  key  observa¬ 
tion  in  this  article  is  that,  in  general,  least-squares  fitting 
alone  does  not  produce  a  map  that  captures  the  invariant 
characteristics  of  the  attractor  described  by  the  data 
yin),  n  =  \, . . .  ,N.  One  must  calculate  from  the  data  as 
many  of  these  invariant  quantities  as  possible  and  then 
impose  them  as  constraints  on  the  fit.  In  this  way  we  em¬ 
phasize  the  fact  that  one  is  creating  a  dynamics  and  not 
just  a  fit  to  data.  The  product  of  our  minimization  of  the 
constrained  cost  function  is  a  mapping  F(y,a)  of  to  it¬ 
self  which  is  not  only  reliable  in  that  it  reproduces  the 
given  data  set  by  having  a  small  cost  function,  but  is  also 
representational  in  that  it  has  the  same  geometric  invari¬ 
ants  as  the  underlying  dynamical  system.  The  methods 
for  identifying  those  invariants  and  utilizing  them  as 
classifiers  for  the  dynamical  system  is  a  matter  of  some 
importance  in  itself. 

The  invariants  are  properties  of  the  function  F(y,a) 
viewed  as  a  dynamical  system  which  maps  Rrf  to  itself. 
We  will  concentrate  on  two  kinds  of  invariants.  One  kind 
of  invariant,  the  Lyapunov  characteristic  exponents 
A.j,A.2,  . . . ,  \d,  describes  the  expansion  or  contraction  of 
phase-space  volumes  under  the  iteration  of  F(y,a).10-17 
Lyapunov  exponents  are  invariant  under  smooth  changes 
of  coordinate  and  are  independent  of  the  initial  condi¬ 
tions  of  the  orbit  one  follows  on  the  attractor.  The 
second  kind  of  invariant  is  the  density  of  points  on  the  at¬ 
tractor  pi y).  It  captures  global  features  of  the  frequency 
with  which  orbits  visit  various  portions  of  the  attractor. 
The  density  is  a  different  kind  of  invariant  than  the 
Lyapunov  exponents.  Its  integrals  with  smooth  functions 
Giy)  are  unchanged  under  operation  with  the  mapping 
'  function  which  underlies  the  dynamics  yin)-*yin  + 1 }, 

piy)Giy)=f  ddy  piy)GiViy,*)) . 

It  too  is  independent  of  the  initial  conditions  on  the  or¬ 
bits.1 'O'1*’19 

In  this  paper  we  find  the  parameters  a  in  F(y,a)  by 
minimizing  a  cost  function  subject  to  certain  constraints. 
The  constraints  are  chosen  to  insure  that  iterations  of  the 
mapping  function  F(y,a)  give  rise  to  values  of  dynamical 
invariants  which  are  the  same  as  those  indicated  by  the 
experimentally  measured  data  set  y in).  In  this  way 
essential  geometric  information  about  the  particular  at¬ 
tractor  on  which  the  data  live  will  be  built  into  the  para¬ 


metric  mapping.  Straightforward  least-squares  minhuj 
zation  does  not  accurately  reproduce  these  invariant 
Thus  one  must  perform  a  least-square  minimization  refer 
ject  to  the  constraints  that  F(y,a)  accurately  produce  tltt 
Lyapunov  spectra  A1(A.2, . . .  ,A.rf  and  the  invariant  dentil 
ty  p(y ).  This  paper  is  devoted  to  explaining  in  detail  ho# 
one  implements  the  idea  just  stated.  ^ 


B.  Cb coring  maps  and  predictor* 


“  •‘CV'X 


Assuming  for  the  moment  that  we  have  successfully  ’ 
embedded  the  data  xin )  in  Rrf  by  creating  d-dimensional  d 
vectors  yin ),  n  =  1, . . .  ,N.  We  need  to  choose  a  class  of.^ 
parametrized  mappings,  a  cost  function  to  minimize,  and.’ 
a  means  to  impose  the  constraints  on  our  minimization*  <3 
The  maps  must  have  some  way  of  fitting  the  data  by}j 
closely  reproducing  one  data  point  from  the  previous  one  j 
by  y ( n  4- 1 ) «  F(y(  n ),  a ).  Our  maps  are  required  to  “look  • 
around”  at  the  behavior  of  the  phase-space  neighbors  of°! 
the  point  y(n)  and  predict  forward  according  to  how  a  • 
cluster  of  phase-space  neighbors,  regardless  of  their  tern-  i 
poral  sequence,  are  moved  forward  in  time.  The  idea 
here  is  that  one  may  use  knowledge  of  the  behavior  of  lo¬ 
cal  regions  of  phase  space  as  well  as  past  points  on  an  or¬ 
bit  to  determine  where  a  point  will  be  mapped  in  the  tem¬ 
poral  future.  The  maps  we  choose  must  then  be  sensitive 
to  their  neighborhood  in  phase  space  and  must  inquire 
about  the  fate  of  any  spatial  neighbor  under  the  map 
without  concern  of  its  temporal  arrival  in  the  neighbor¬ 
hood.  The  map  will  then  try  to  take  any  new  point  y  and 
map  it  forward  to  some  weighted  average  of  its  neigh¬ 
bors’  forward  evolution. 

We  take  our  mappings  to  be  of  the  form  „  . 


F(y,a)=  2  y(n  +  l)g(y,y(n);a) , 

ft  «  1 


where  g(y,y(n);a)  is  near  1  for  y=y in),  and  vanishes 
rapidly  for  nonzero  |y— y(n  )|;  the  vertical  bars  represent 
some  norm,  in  our  case  Euclidean,  in  Rd.  Fiy(k),a)  will 
then  be  quite  close  to  y( k  + 1 ). 

This  type  of  mapping  is  strongly  suggestive  of  the  form  ' 
used  in  the  statistical  literature  under  the  name  of  kernel 
estimation  or  kernel  density  estimation.  An  explicit  re* 
cent  example  that  illustrates  the  similarity  is  found  ift 
Ref.  20.  Other  useful  discussions  of  this  method  applied 
to  various  problems  are  to  be  found  in  Refs.  21  and  22; 
our  attention  was  directed  to  this  similarity  by  Farmer 
and  Sidorowich.23  We  do  not  claim  to  have  a  better 
method  for  choosing  our  function  g  than  those  in  the 
literature,  but  our  motivation  here  does  differ  from  all  the 
citations  except  Rice.20  Our  constraints  on  g  are  also 
different,  but  could  be  modified.  For  example,  the  in* 
tegral  of  g  over  y  need  not  be  unity,  nor  do  we  require 
that  g  be  positive.  We  will  return  to  a  discussion  of 
choices  for  g  in  our  summary  in  Sec.  VI. 

Our  choice  here  for  giy, yin  );a) — one  among  many,  of 
course — is  this: 
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.exp[Hy— y(n)l:/a]  a,  +a2y(n  )*(y— 


g(y,y(«  );*)=- 


A'-rl 

2  exp[-|y-y(n)|2/a]  a,+ 


y(n))+  2  a*<ly-y(n)|2/CT)m* 

_ i-3 _ 

2  a*(!y-y(«)l2/a)m* 

k-2 


The  parameter  space  a  is  P  dimensional, 
a=(a|,a2, . . .  ,aP).  a  is  a  fixed  parameter  that  provides 
a  scale  we  can  use  to  determine  which  points  in  the  data 
set  are  “close”  to  y.  The  mk’s  are  also  fixed  at  various 
values.  We  could  treat  both  a  and  the  mk‘s  as  parame¬ 
ters  to  be  optimized  in  the  same  sense  as  the  a*s.  Howev¬ 
er,  we  choose  not  to  do  this  in  our  work;  not  for  any  fun¬ 
damental  reasons,  but  because  we  wished  to  explore  other 
issues  and  wished  to  keep  down  the  size  of  the  parameter 
space  over  which  our  minimization  searches  were  per¬ 
formed. 

The  weight  function  g(y,y(/r);a)  which  we  use  was  ar¬ 
rived  at  after  some  experimentation.  It,  as  do  many  other 
choices,  certainly  satisfies  our  general  requirements. 
These  requirements  include  the  following: 

The  function  is  sensitive  to  the  presence  of  near 
“neighbors”  in  phase  space.  Only  points  y(n)  within  a 
distance  from  y  of  order  v'a  make  any  sizable  contribu¬ 
tion  tog(y,y(n);a). 

When  o-vO,  g(y,y(n);a)  becomes  essentially  a 
Kronecker  delta  and  the  point  y(n)  is  mapped  precisely 
to  y(n  +  l). 

It  is  easy  to  differentiate  both  in  y  and  in  a.  These 
derivatives  are  important  in  the  minimization  of  the  cost 
function  using  our  methods,  and  having  explicit  expres¬ 
sions  for  the  required  derivatives  in  either  of  these  in¬ 
dependent  variables  makes  the  optimization  routines  run 
much  faster. 

In  the  function  we  have  chosen  it  is  easy  to  retain 
many  parameters  all  of  the  same  general  form,  thus  as  the 
number  of  constraints  on  the  optimization  of  the  cost 
function  is  increased,  the  pattern  of  our  searches  remains 
the  same. 

The  essential  function  which  senses  neighbors,  namely 
the  exponential,  can  easily  be  replaced  by  other  choice.,, 
such  as  those  in  Table  3.1  of  Silverman’s  monograph.21 
The  general  form  of  our  arguments  goes  through  then 
without  modification. 

By  virtue  of  the  term  involving  a2  in  the  numerator, 
this  form  of  g(y,y(n  );a)  allowed  us  to  satisfy  constraints 
set  by  the  Lyapunov  exponents  with  numerical  stability 
and  accuracy.  The  denominator  serves  as  an  approxi¬ 
mate  counter  for  the  number  of  neighbors  of  the  point  y, 
so  the  numerator  works  less  to  produce  the  required  aver¬ 
age  for  the  forward  prediction  of  the  point  y.  The  pres¬ 
ence  of  the  denominator  assured  us  of  numerical  ease  in 
making  the  parameters  in  the  map  F(y,a)  meet  our  re¬ 
quirement  of  producing  an  average  over  neighborhood 
points  in  projecting  forward  in  time  any  phase-space 
point.  This  made  the  numerical  algorithms  we  use  much 
more  efficient  and  accurate. 

The  choice  of  cost  function  is  also  rather  much  up  to 
us.  Since  we  are  to  think  of  F(y,a)  as  a  dynamical  system 


evolving  points  y(n )  into  new  points  y(n  + 1 ),  we  should 
consider  asking  the  map  to  reproduce  accurately  from 
y(n )  not  only  the  “next”  point  y(n  + 1 )  but,  via  iteration, 
a  sequence  of  points  y(n  +  l),y(n+2),y(n+3), 
. .  .,y(n  +  L )  up  to  some  L  beyond  which  we  simply  do 
•not  trust  the  accuracy  of  our  algorithm  F  or  of  the 
machines  we  use  to  compute  the  future  y’s. 

This  suggests  the  predictor  for  future  points  to  be  a 
linear  combination  of  iterated  powers  of  the  map  F(y,a), 

y(m  +  l)=  2  XkFk(y(m -Jfc  +  l),a)  ,  (3) 

*-i 

where  F  is  the  kth  iterate  of  F  as  described  above.  If 
F(y,a)  were  the  exact  mapping,  then  each  term  in  the 
sum  over  k  would  be  Xky(m  + 1 ).  Thus  we  require 

2^  =  i. 

The  X’s  weight  the  various  iterates  of  F  and  are  used  to 
determine  which  iterates  of  F  we  believe  are  the  most  ac¬ 
curate.  Typically,  one  would  require  Xj>Xj+  ,  to  indi¬ 
cate  that  the  lower  iterates  of  F  are  believed  to  be  more 
accurate  than  the  higher  iterates.  This  predictor  is  a  nat¬ 
ural  generalization  to  the  nonlinear  problem  of  the  com¬ 
mon  linear  predictor 
L 

y(m  +  1)=  2  Xky(m  —k  + 1 )  , 

k 

with  the  clear  differences  associated  with  the  iterative  na¬ 
ture  of  the  map  F(y,a). 

This  predictor  [Eq.  (3)]  combines  both  past  temporal 
information  from  times  m  —k  + 1;  k  =  1,2, . . .  ,L  and  in¬ 
formation  from  all  the  phase-space  neighbors  of  the  orbit 
points  y(m— k  +  1)  because  of  the  structure  of  F(y,a). 
The  combination  of  spatial  and  temporal  information 
provides  a  significant  "lever  arm”  which  permits  Eq.  (3) 
to  quite  accurately  make  forecasts  about  the  forward  evo¬ 
lution  of  points  y  in  Rd.  By  utilizing  the  phase-space  in¬ 
formation  in  F(y,a)  at  each  temporal  step  we  efficiently 
tap  properties  of  the  full  data  set. 

The  cost  function  associated  with  this  predictor  is 


C(X,a)= 


2*  ly(»  +  D-  2  X*F*<y(n-*  +  l),a)|2 

n  'i.  k  “  1 

2  ly(«)-y(«)l2 

n  1 


This  kind  of  cost  function  will  automatically  contain  in¬ 
formation  on  the  Lyapunov  exponents  which  themselves 
are  expressions  of  the  dynamics  as  iterations  of  the  map. 
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Some  information  on  the  invariant  density  function  on 
*the  attractor  is  also  contained  in  this  improved  cost  func- 
•  tion.34 

Another  major  consideration  to  us  is  the  great 
difference  in  the  coordinate  scale  of  various  attractors. 
The  numerator  of  the  cost  function:  [Eq.  (4)]  is  the  residu¬ 
al  of  the  mapped  function  summed  over  the  entire  trajec¬ 
tory,  and  hence  gives  a  measure  of  the  sum  of  the  abso¬ 
lute  errors  over  all  the  mapped  points.  Since  the  absolute 
error  is  obviously  dependent  on  the  macroscale  of  the  at¬ 
tractor,  it  i3  more  informative  to  rescale  the  final  cost 
function  value  in  some  manner  which  reflects  error.  In 
our  samples,  the  scale  of  the  attractor  of  the  Henon  at¬ 
tractor  is  on  the  order  of  unity,  while  that  of  the  Lorenz 
attractor  is  of  order  100.  Hence  some  form  of  rescaling 
of  the  cost  function  became  desirable  in  order  to  have  a 
relative  measure  of  comparison  between  two  systems  with 
different  macroscale.  We  chose  a  normalization  in  the 
following  straightforward  manner:  we  simple  summed 
the  magnitudes  of  the  position  vectors  of  all  the  points  on 
the  attractor,  and  retained  this  value  as  a  constant.  Ab¬ 
solute  values  for  the  cost  function  after  normalization  by 
the  denominator  in  Eq.  (4)  give  a  more  sensible  relative 
measure  of  the  error  of  our  prediction  function  F(y,a). 

We  note  that  FS  suggest  forecasting  the  evolution  of  a 
point  y  by  looking  around  at  the  neighbors  of  y  among 
the  data  set  y(n )  and  observing  where  these  neighbors  go 
under  one  iteration  of  the  underlying  map  talcing  the 
y(n)  to  y(n  +  l).  They  determine  the  future  of  the  new 
point  y  by  an  interpolation  involving  the  future  of  its 
neighbors.  Our  mapping  function  Eqs.  (1)  and  (2)  does 
precisely  this  as  indicated.  AH  points  in  the  data  set  are 
given  some  weight  in  the  future  of  y,  but  if  g(y,y(n);a) 
falls  rapidly  for  large  |y— y(n)|,  as  We  shall  always 
choose,  only  members  of  the  data  set  y(n )  near  y,  i.e.,  the 
neighbors,  play  much  of  a  role  in  its  future.  Our  F(y,a) 
in  that  sense  is  an  analytic  formulation  of  the  FS  idea. 
More  or  less  weight  can  be  given  to  the  near  neighbors  by 
different  choices  for  the  function  g(y,y(n);a).  The 
Gaussian  we  work  with  could  be  replaced  by  a  Lorentzi- 
an  or  other  choices  which  weight  neighbors  more. 

C.  Invariants 

With  a  inap  and  a  cost  function,  Eqs.  (1),  (2),  and  (4), 
we  are  ready  for  the  constraints.  Section  III  is  devoted  to 
a  discussion  of  Lyapunov  exponents.  In  it  we  first  turn  to 
the  extraction  of  the  Lyapunov  exponents  A.I(A.2, . . .  ,Xd 
from  the  data  y(l),y(2), . . .  ,y(N).  We  do  not  add  any¬ 
thing  but  our  own  experience  to  that  of  many  workers 
who  have  explored  the  calculation  of  X,  from  data.  We 
attempt  to  convey  to  the  reader  an  overview  of  the  avail¬ 
able  methods  for  determining  Lyapunov  spectra  and  a 
sense  of  their  reliability.  Therefore  that  portion  of  Sec. 
Ill  may  be  skipped  by  persons  with  experience.  We  in¬ 
clude  it  here  since  determining  the  Xt's  is  an  essential  step 
in  our  plan  for  determining  Fly, a)  and  we  have  chosen  to 
comment  on  how  we  have  done  it  rather  than  refer  the 
reader  to  the  literature.  Of  course,  we  do  that  too.  That 
established,  we  discuss  how  to  determine  these  numbers 
in  terms  of  the  Fly, a).  Equating  the  numerical  values  for 


Xf  from  the  data  to  their  expression  in  terms  of  paramo*! 
ten  a  in  Fly, a)  will  constitute  our  first  set  of  constraints ij 
aa  the  minimization  of  C(X, a). 

Section  IV  contains  our  discussion  of  the  invariant 
tribution  of  points  on  the  attractor.  In  principle,  this  -i 
quantity,  which  we  called  ply),  contains  an  infinite-*; 
amount  of  information  on  the  dynamics.  A  finite  data  aetH 
yin )  restricts  the  resolution  we  have  of  this  information. 
We  have  chosen  to  express  this  finite  amount  of  informa-  • 
tion  in  terms  of  the  projection  of  ply)  on  a  set  of  dual 
basis  functions  which  are  a  complete  set  in  Rd.  Keeping 
a  finite  number  of  these  functions  is  equivalent  to  a  finite 
resolution  view  of  the  complex  structure  of  ply ). 19 

One  of  our  contributions  in  this  work  is  a  scheme  for 
choosing  the  dual  basis  functions  "tuned”  to  the  struc¬ 
ture  of  ply).  This  allows  us  to  represent  our  finite  resolu¬ 
tion  of  ply)  by  a  small  number  of  terms  in  an  expansion 
in  the  optimal  basis  functions.25-21  By  projecting  the 
ply)  determined  from  the  data  onto  these  basis  functions, 
we  can  determine  the  coefficients  of  the  expansion  of  ply) 
in  this  basis.  Similarly,  we  can  project  the  ply)  deter¬ 
mined  from  the  map  Fly, a)  onto  these  basis  functions 
and  determine  the  expansion  coefficients  of  the  map. 
Equating  the  coefficients  one  determines  from  the  data  to 
the  ones  determined  from  the  map  constitutes  our  final 
constraints  on  the  minimization  of  C1X, a).  Further¬ 
more,  we  show  how  the  components  of  ply),  in  this  basis, 
are  the  elements  of  the  eigenvalue  unity  eigenvector  of  a 
finite-dimensional  matrix  constructed  from  Fly, a)  and 
the  dual  basis  functions. 

In  Sec.  V  we  describe  our  implementation  of  the  con¬ 
strained  minimization  program2®  for  two  model  systems: 
the  Henon  map  of  the  plane  to  itself  and  (2)  the  Lorenz 
system.  In  each  case  we  numerically  generate  a  data  set 
of  x(n  )’s.  We  then  discuss  in  some  detail  our  experience 
in  establishing  the  dimension  of  the  space  in  which  the 
dynamics  is  embedded.  We  also  discuss  the  calculation 
of  Lyapunov  exponents,  and  aspects  of  the  invariant  dis¬ 
tribution  on  the  attractor  from  the  y(n)’s.  Finally,  we 
carry  out  the  constrained  minimization  of  the  cost  func¬ 
tion  and  indicate  how  well  our  parametrized  mappings 
are  able  to  perform  in  predicting  orbits  other  than  those 
in  the  given  data  set. 

In  this  paper  we  are  attempting  to  describe  a  method 
of  analyzing  experimental  data.  For  such  a  situation  we 
do  not  know  a  priori  the  correct  embedding  dimension, 
the  correct  Lyapunov  exponents,  or  the  underlying 
dynamical  system  that  can  be  used  to  generate  the 
correct  invariant  distribution.  Yet  we  have  used  data  sets 
generated  by  a  dynamical  system  that  we  know.  We  have 
used  known  systems  for  two  reasons.  The  first  is  that  it 
provides  a  simple  way  to  obtain  large,  noise  free,  data 
sets.  Second,  it  provides  a  way  of  measuring  how  well  ex¬ 
isting  techniques  are  able  to  determine  the  embedding  di¬ 
mension  and  the  Lyapunov  exponents.  In  order  to  simu¬ 
late  experimental  systems  we  treat  the  data  set  as  having 
come  to  us  from  an  unknown  source.  Thus  we  do  not  use 
any  of  the  known  properties  of  cither  the  Henon  or  the 
Lorenz  system. 

An  issue  of  some  importance  we  do  not  address  in  this 
paper  is  that  of  extrinsic  noise  which  could  contaminate 
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_  our  sign*]  xin ).  This  it  not  a  dismissal  of  this  important 
issue  but  an  attempt  to  separate  out  the  matters  of 
efficiency  and  utility  of  our  plan  for  prediction  on  strange 
attractors  from  issues  concerning  the  practical  degrada¬ 
tion  of  our  procedures  by  external  noise.  An  equally  im¬ 
portant  issue  is  the  quantity  of  data  available.  The  deter¬ 
mination  of  Lyapunov  exponents  is  very  difficult  for  short 
data  sets.  As  we  have  stated  above  the  resolution  of  p{y) 
is  determined  by  the  number  of  data  points  available.  As 
the  dimension  of  the  phase  space  increases,  the  amount  of 
data  necessary  for  accurate  prediction  increases  dramati¬ 
cally.'  We  will  return  to  the  implications  of  noisy  and/or 
short  data  sets  for  our  prediction  procedure  in  later 
work.  For  now  we  assume  that  we  are  given  essentially 
noise-free,  arbitrarily  long  time  series  x(n ). 

It  is  our  expectation  that  our  experiences  with  the  two 
systems  listed  above  will  give  us  the  ability,  in  many  in¬ 
stances,  to  construct  models  in  the  form  of  our 
parametrized  mapping  F(y,a)  which  allow  prediction 
and  control  of  the  underlying  nonlinear  dynamical  sys¬ 
tem  producing  an  observed  signal  x{n).  The  details  of 
the  F(y,a)  for  a  specific  application  should  reflect  the 
known  features  of  the  physical  or  other  phenomena  giv¬ 
ing  the  signal.  It  seems  too  bold,  if  at  all  possible,  to  sug¬ 
gest  any  general  rules  for  choosing  forms  for  F(y,a). 
This  is  sure  to  be  a  rich  area  for  experimentation  and  our 
own  choice  will  be  motivated  by  considerations  we  shall 
defend  in  a  later  section  and  slightly  alluded  to  above. 

The  matter  of  noise  will  be  addressed  in  a  future  paper. 
Our  methods  for  dealing  with  noise  follow  those  outlined 
by  Fuller29  and  seem  similar  to  the  ideas  of  Sidorwich.30 

n.  CHOICE  OF  AN  EMBEDDING  SPACE 

In  this  section  we  illustrate  how  or.e  can  determine  the 
phase-space  embedding  dimension  d  from  the  scalar  time 
series  xin ),  n  =  1, . . . , ND.  We  assume  that  the  data  set 
is  long  enough  that  we  need  not  be  concerned  with  sta¬ 
tistical  issues  about  the  numerical  accuracy  of  the  quanti¬ 
ties  we  consider  below.  We  also  assume  extrinsic  noise  is 
absent  from  the  xin  )’s  when  we  receive  them.  Matters  of 
short  and/or  noisy  data  sets,  while  critical  in  all  applica¬ 
tions,  are  addressed  only  peripherally  in  this  paper. 

Following  the  work  of  Packard  et  a /.5  and  Mafic  and 
Takens6,7  and  the  developmental  work  of  numerous  oth¬ 
ers  we  seek  a  set  of  lagged  variables  xin  ),x(n  +tx), 
xin  +r2), . . .  ,xin  +rd_|)  which  act  as  the  coordinates 
in  a  d-dimensional  space  in  which  the  dynamics  produc¬ 
ing  the  xin  )’s  is  fully  captured  or  embedded. 

The  choice  of  lags  ra  is  not  a  well  agreed  upon 
matter.31  The  issue  is  the  accuracy  and  efficiency  with 
which  the  d-dimensional  vectors  that  result  from  a  par¬ 
ticular  choice  of  r0’s  represents  the  phase  space  in  which 
the  attractor  resides.  If  the  underlying  system  were  a 
differential  equation  and  a  scalar  variable  xit)  were  mea¬ 
sured  at  discrete  times  xin  )=x(:0+nAt ),  then  we  are  by 
the  choice  of  lagged  variables  trying  to  find  a  discrete  re¬ 
placement  for  the  usual  phase-space  coordinates 
x(t),dx  /dt, . . .  ,dd~lx  /dtd~l.  Mafle  and  Taken’s  re¬ 
sults  indicate  that,  in  principle,  any  choice  of  lags  r0  will 
do.  We  adopt  the  practice  of  choosing  a  single  lag  r  and 


making  all  other  lags  multiples  of  r.  The  question  of 
what  is  the  best  way  to  choose  r  is  still  open.  In  a  heuris¬ 
tic  sense,  if  r  is  too  small,  then  the  coordinate  at  xin  +r) 
and  jc(n+2r)  represent  almost  the  same  information. 
Similarly,  if  r  is  too  large,  then  x(n+r)  and  jc(n+2r) 
represent  distinct  uncorrelate  descriptions  of  the  embed¬ 
ding  space. 

For  reasons  of  consistency  and  ease  in  calculating 
Lyapunov  exponents  (cf.  Sec.  Ill)  we  adopt  the  following 
practice.  We  take  the  original  scalar  measurements  and 
calculate  its  autocorrelation  function 

4;  fTx(t+r)xit)dt  . 

We  then  choose  r  to  be  approximately  to  ^  the  time 
associated  with  the  first  local  minimum  of  the  autocorre¬ 
lation  function.  We  find  that  this  system,  although  some¬ 
what  arbitrary,  works  well  in  practice  and  provides  a  sim¬ 
ple  and  systematic  way  of  determining  r.  We  set  r  to  uni¬ 
ty  and  thereby  establish  a  time  scale  for  the  problem. 
The  data  xin),  n  =  l, . . .  ,ND  thus  become  measure¬ 
ments  of  the  scalar  variable  separated  by  a  constant  time 
step  t. 

We  then  form  d  vectors 

y(n)=(x(n),x(n  +  I), . . .  ,xin+d  —  1))  (5) 

for  n  =  1,2, . . . ,  N=ND—d  in  a  space  Rd  capturing  the 
geometric  structure  of  the  attractor  on  which  the  orbits 
x in)  lie.  To  establish  d  we  need  some  characteristic  of 
the  attractor  that  becomes  unchanging  as  d  becomes 
large  enough,  thus  indicating  that  the  attractor  can  be 
embedded  in  Rd.  The  usual  Hausdorff  or  other  dimen¬ 
sions  of  the  attractor  are  such  characteristic  quantities. 
Numerical  calculations  of  the  Hausdorff  dimension 
dAiN,d)  of  an  attractor  may  depend  on  the  finite  length 
of  the  data  set  N  and/or  the  embedding  dimension  d.  For 
N  large  enough  dA  will  become  independent  of  d  when 
the  attractor  is  properly  embedded  in  Rd.  Operationally 
one  increases  d  until  dA  remains  constant  and  identifies 
the  minimum  d  where  dA  “saturates”  as  the  embedding 
dimension. 

In  fact,  we,  along  with  numerous  others,  do  not  recom¬ 
mend  the  computation  of  dA,  however  geometrically  ap¬ 
pealing  it  may  be,  because  it  is  too  demanding  of  comput¬ 
er  time.  We  suggest,  and  we  use,  the  properties  of  the 
correlation  function  Dir),  proposed  by  Takens32  and  by 
Grassberger  and  Procaccia,33  which  is  much  easier  to 
compute.  In  terms  of  the  data  vectors  y(n)  this  is 
defined  to  be 

oir.A'.d)^-— 2_— 1  |e(,-|y(;)-y(.)l) , 

i¥=j  (6) 

where  Qix)  is  the  Heaviside  function  0(x>O)=l  and 
Oix  <0)=0.  The  vertical  bars  represent  some  measure 
of  distance  in  Rd — we  use  the  Euclidean  norm,  but  that  is 
only  a  convenient  choice.  This  correlation  function 
counts  the  points  of  the  attractor  within  a  distance  r  of 
each  other.  Thus  it  possesses  much  of  the  same  geometri- 
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cal  content  u  the  Hausdorff  or  other  invariant  dimension 
attributes. 

For  N  large  enough  the  behavior -of  D(rtN,d)  for 
small  r  becomes  independent  of  N.  As  one  would  expect 
from  scaling  arguments  about  fractals,  as  well  as  observa- 
tionally, D(r,H,d)  is  seen  to  take  the  form  ■  *  ■ 

D{r,N,d)=<tHr,d)r«d'  v/,;,  .  ■■  ,  , 
for  small  rand  larged. 35 

We  will  identify  as  the  embedding  dimension  that  value 
of  d  where  the  structure  in  D(r,N,d)  becomes  indepen¬ 
dent  of  d.  In  this  regime  it  is  sufficient  that  D(r,N,d), 
becomes  independent  of  d  over  a  range  of  r  near  r-*0, 
and  large  N  [r=0  in  a  finite  data  set  always  gives  strictly 
zero  for  D(r,N,d )  and  is  an  uninteresting  limit]. 

To  illustrate  the  use  of  the  correlation  function  as  an 
embedding  dimension  discriminant  We  have  calculated 
D(r,N,d)  for  very  long  time  series  taken  from  the  two 
examples  we  will  be  working  with  in  this  paper:  (i)  the 
Henon  map  of  the  plane  to  itself,36 

X|(«  +  l)=1.0— ax,(n )2+Ar2(n )  , 

(7) 

x2(n  +  l)=hx1(n)  , 

with  conventional  parameter  values  a  =  1.4  and  6=0.3, 
and  (ii)  the  Lorenz  system  of  three  autonomous 
differential  equations37 

dx,(r) 

— —  =<r(x2U)-x,U))  , 

dx2(t ) 

— — —  =  -Xi(r)x3(/)+rx,(t)— x2(t)  ,  (8) 

dx2(t) 

— — — =x](t)x2(t)-bx2(t)  , 

with  parameter  values  or  =  16,  6=4,  and  r=45.92. 

For  the  Henon  map  we  took  an  initial  condition  lying 
in  its  basin  of  attraction  and  iterated  the  map  4550  times. 
The  first  50  iterates  were  discarded  as  representing  tran¬ 
sient  behavior,  while  the  last  4500  points  of  *,(«)  and 


FIG.  1.  D(R)  vs  r  for  the  Henon  map.  y,(n)=(jc,(/i), 
*i(n  +1))  for  4500  points. 

vectors  yi(n)=(xI(n),^1(n-l-l)),  we  reconstruct .  the 
figure  seen  in  Fig.  2.  This  is,  as  should  not  be  surprising 
in  this  simple  example,  a  rotated  form  of  the  Henon  at¬ 
tractor.  The  usual  display  of  the  Henon  attractor  is  ob¬ 
tained  by  plotting  (jc,(/j  ),x2(n ))  for  our  data.  Since 
x ,  ( n )  is  ( 1  /6  )x  2  ( n  + 1 ),  the  coincidence  of  these  plots  is 
certainly  not  remarkable.  Our  goal  in  presenting  this 
kind  of  detail  is  as  a  guide  to  what  one  might  expect  in 
more  complicated  examples  rather  than  as  revelations 
about  the  Henon  map. 

Next  we  turn  to  the  Lorenz  equations.  Once  again  we 
chose  initial  conditions  in  the  basin  of  attraction  and 
solved  ’Eqs.  (8)  with  a  straightforward  fourth-order 
Runge-Kutta  ordinary  differential  equation  (ODE)  solver 
with  a  fixed  time  step.  After  discarding  the  first  50  time 
steps  as  transients,  we  recorded  x2,  and  x2  for 
N= 4500  corresponding  to  many  natural  cycles  of  the  or¬ 
bit  around  the  attractor.  From  each  of  the  three  data 
sets  we  formed  the  d  vectors  as  in  Eq.  (5)  and  with  them 
evaluated  the  correlation  function  D(rtN,d)  for 
d  =  1,2 . 5.  The  D(r,N,d  )’s  for  y,(/t )  data  are  shown 


x2(n  )  were  then  used  to  make  d  vectors 

y/(n)=(x,.(n),x,(n  +  1), . . .  ,x,{n  +d~  1)) 

for  f  =  l  or  2.  Ford  =  l,2, ...  ,5  D(r,N,d)  was  comput¬ 
ed  using  an  efficient  code  developed  by  Theiler.34 
For  y ,(n)  data  these  D{r,N,d)  are  plotted  in  Fig.  1.  A 
similar  plot  was  generated  for  y 2(n),  but  is  not 
shown.  Because  of  the  simplicity  of  the  connection 
x2(rt  + 1  )=bx,(n  )  in  the  Henon  map,  these  iwo  views  of 
D(r,N,d)  are  really  redundant.  However,  in  the  spirit  of 
treating  each  data  series  as  having  originally  come  to  us 
from  a  source  whose  underlying  dynamics  is  unknown  we 
performed  both  calculations. 

While  a  cautious  and  careful  observer  might  say  the 
embedding  dimension  for  the  y^n )  data  would  be  d— 3, 
we  feel  safe  in  concluding  from  these  figures  that  d—  2. 
Computations  with  N  greater  than  4500  support  this  con¬ 
clusion. 

Further,  if  we  take  the  Jtj  (n)  data  and  plot  the  two- 
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FIG.  2.  Henon  attractor  xx(n)  plotted  against  x^n  + 1 ). 
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k  Fig.  3.  An  embedding  dimension  of  d =3  is  fairly 
clearly  a  safe  choice  for  these  data.  A  bolder  choice 
would  have  been  d  = 2.  Since  it  is  known  that  the  Haus- 
dorff  dimension  of  the  Lorenz  attractor  is  just  above  2  in 
this  regime  of  parameter  space,  this  would  have  been  a 
convincing,  Although  incorrect,  choice.  The  message 
here  is  that  choosing  d  too  large  entails  extra  subsequent 
computation,  but  no  loss  of  information  on  the  attractor. 
It  is  probably  safer  to  live  with  a  d  one  dimension  too 
large  as  a  general  matter  of  care.  We  thus  choose  d  =  3. 
The  results  of  the  y2  and  y3  data  are  not  shown.  As  with 
the  Henon  example  the  results  of  y2  and  y3  are  similar  to 
those  of  y,.  The  fact  that  the  y^  y2,  and  y3  vectors  for 
the  Lorenz  data  (y3  and  y2  for  Henon)  yields  similar  re¬ 
sults  is  to  be  expected  since  all  three  measurements 
evolve  on  the  same  attractor. 

Next  we  plot  the  points  y1(n)=(x1(n),x,(«  +  l), 
x^n+2))  in  the  three-dimensional  embedding  space. 
These  are  shown  in  Fig.  4  as  a  projection  on  a  plane 
with  normal  vector  n=(cos0)i1(n  l+isinflji^n  + 1) 
+0x1(n+2)  for  0=1.31.  We  note  the  similarity  be¬ 
tween  Fig.  4  and  the  well-known  shape  of  the  Lorenz  at¬ 
tractor.  Thus  the  method  of  phase-space  embedding  reli¬ 
ably  reproduces  the  Lorenz  attractor.  For  the  two  exam¬ 
ples  we  have  used  the  reconstructed  attractor  is  similar  in 
appearance  to  the  attractor  generated  by  the  “true"  un¬ 
derlying  equations  of  motion.  In  general,  the  recon¬ 
structed  attractor  will  not  have  this  visual  similarity. 
However,  the  reconstructed  attractor  will  contain  all  of 
the  important  invariant  information  as  the  true  attractor. 
The  difference  in  visual  shapes  is  the  result  of  a  nonlinear 
change  of  variables  between  the  true  dynamical  variables 
and  the  reconstructed  variables. 

We  close  this  section  with  a  summary  note  reminding 
the  reader  that  our  use  of  the  correlation  integral  Eq.  (6) 
has  been  to  establish  an  embedding  dimension  d  in  which 
to  view  the  system  attractor  described  by  our  time  series 
x(n ).  Wc  chose  D(r,N,d)  because  it  is  familiar,  easy  to 
compute,  and  has  a  clear  geometrical  meaning.  For  us  it 


FIG.  3.  D(r)  vs  r  for  the  Lorenz  equations,  y,(n)  for  4500 
points  and  embedding  dimensions  d  =  l,...5.  For  this  case 
r  — 45.92,  b— 4.0,  and  a  =  16. 
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FIG.  4.  Lorenz  attractor  created  from  x,(n)  data.  The  pa¬ 
rameter  values  are  r  =45.92,  b  =4.0,  and  a  =  16,  while  the  pro¬ 
jection  angle  is  6=  1.31. 


is  a  diagnostic  tool.  While  the  details  of  the  small  r  be¬ 
havior  D(r,d)~rv«J>(r)  contains  important  information 
about  the  dynamics,  we  do  not  focus  on  that  here. 
Indeed,  we  are  quite  happy  to  accept  other  diagnostic 
tools  in  its  place. 

* 

III.  LYAPUNOV  CHARACTERISTIC 

EXPONENTS— FROM  DATA  AND  FROM  THE  MAP 

In  this  section  we  discuss  how  one  determines  the 
Lyapunov  exponents  that  govern  a  dynamical  system. 
First  we  discuss  how  to  extract  them  from  an  experimen¬ 
tal  data  set  and  then  from  our  mapping  F(y,a).  By 
choosing  the  parameters  a  in  such  a  way  that  F(y,a) 
yields  the  same  Lyapunov  exponents  as  the  experimental 
data  set,  we  are  forcing  a  constraint  on  F(y,a)  that  is  not 
explicitly  required  by  minimizing  the  cost  function  given 
by  Eq.  (4).  This  local  constraint  should  improve  our  abil¬ 
ity  to  predict  the  short-term  (and  possibly  long-term)  evo¬ 
lution  of  points  that  are  not  in  the  data  set,  but  near  the 
attractor.  Certainly  points  outside  the  basin  of  attraction 
of  the  attractor  we  have  observed  in  the  original  data  set 
will  not  evolve  according  to  our  F(y,a). 

Rather  than  writing  our  own  computer  program,  and 
thereby  become  embroiled  in  the  controversy  of  what  is 
the  best  way  to  determine  Lyapunov  spectra  from  an  ex¬ 
perimental  time  series,  we  have  chosen  to  use  methods 
that  have  already  been  proposed  by  two  different  research 
groups.  By  comparing  the  results  of  both  methods  we 
hope  to  improve  our  confidence  in  the  spectra  given  by 
each  of  them  separately.  The  first  method  we  shall  report 
on  was  developed  by  Eckmann  et  al.3i  The  second 
method  was  developed  by  Wolf  et  al?9  Finally,  we  will 
show  how  we  calculated  the  Lyapunov  spectra  from  our 
mapping  F(y,a). 

The  choice  of  an  appropriate  data  set  for  use  in  either 
the  Eckmann  et  al.  or  the  Wolf  et  al.  method  is  some- 
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thing  that  cannot  be  overstressed.  As  stated  in  Sec.  II  the 
time  lag  r  between  successive  measurements  of  the 
-dynamical  variable  must  be  appropriately  chosen,  if  one 
wants  optimal  results. 

A.  Ecfanmnn-Kamphorit-Rnelle-CIlIberto  Method 

For  'vc  Eckmann  et  al.  method  the  fortran  source 
code  we  used  when  performing  our  numerical  experi¬ 
ments  on  the  dynamical  systems  denoted  in  Sec.  II  was 
provided  by  the  authors  of  Ref.  38.  It  assumes  that  the 
input  is  a  string  of  positive  integer  data  whose  sampling 
rate  is  r.  [The  temporary  conversion  of  the  data  set 
x(n ),  n  —  1, . . .  ,Nd  to  positive  integers  for  the  sake  of 
the  Lyapunov  calculation  should  not  be  a  difficult 
matter.]  The  code  reads  the  data  set  and  embeds  it  in  a 
d-dimensional  space  in  the  manner  specified  in  Sec.  II. 
The  result  is  a  set  of  N=ND  .data  vectors 

y(n)=(x(n),x(n  +  l) . x[n+d  —  1))  where  we  have 

normalized  r  to  unity  [cf.  Eq.  (5)].  It  then  chooses  an  ini¬ 
tial  y(n )  and  finds  all  neighbors  of  y(n )  within  a  radius  r. 
These  points,  as  well  as  their  forward  images,  are  used  to 
construct  a  linear  mapping  T  from  time  n  to  time  n  + 1. 
The  Lyapunov  exponents  are  related  to  the  eigenvalues  of 
the  successive  iterates  of  the  map  T.  For  a  detailed  dis¬ 
cussion  of  the  algorithm  we  direct  the  reader  to  Ref.  38. 

The  Eckmann  et  al.  method  assumes  that  the  embed¬ 
ding  dimension  d  is  related  to  the  number  of  Lyapunov 
exponents  via  the  rule  d—(dm  —  \)M  +  \,  where  dm  is  the 
number  of  Lyapunov  exponents  and  M  is  a  positive  in¬ 
teger.  By  allowing  dm  and  M  to  range  over  various 
values  a  wide  range  of  embedding  dimensions  is  used. 
We  remark  that  the  reader  will  recall  that  in  Sec.  II  we 
established  a  method  for  determining  the  minimum 
embedding  dimension  d.  The  data  vectors  y (n)  are  as¬ 
sumed  to  live  on  some  attractor  that  occupies  some  por¬ 
tion  of  Rd.  It  is  a  numerically  difficult  exercise  to  calcu¬ 
late  Lyapunov  exponents  from  data.  Thus  it  is  necessary 
to  examine  a  wide  range  of  possible  embedding  dimen¬ 
sions  d.  It  is  our  experience  that  the  calculated  values  of 
the  exponents  converge  onto  their  correct  values  as  d  is 
increased  above  the  number  specified  by  methods  in  Sec. 
II.  We  report  numerical  experiments  for  dm  in  the  range 
between  2  and  9  for  M  =  1,2.  (We  remark  that  M  =  1  re¬ 
covers  d=dm,  while  Af=2  is  slightly  below  the  Takens 
and  Mane  limit.6,7)  In  all  of  our  tests  we  iterated  the 
tangent  map  T  2000  times  before  evaluating  the 
Lyapunov  exponent. 

To  get  a  feel  for  the  proper  densities  of  points  on  the 
reconstructed  attractor,  it  is  useful  to  use  diagnostics 
such  »?,  say,  a  histogram  cf  the  number  of  neighbors  fai¬ 
ling  within  a  range  around  the  smallest  nearest-neighbor 
distance  on  the  attractor.  If  the  density  of  points  on  an 
attractor  is  quite  inhomogeneous,  much  higher  mean 
point  densities  are  often  necessary  to  insure- that  most 
points  have  at  least  a  few  nearby  neighbors.  Often  a  use¬ 
ful  diagnostic  is  simply  to  plot  out  the  reconstructed  at¬ 
tractor,  and  visually  obtain  an  intuitive  feel  for  how 
homogeneous  the  point  density  is.  As  a  general  rule  of 
thumb  (inspired  by  Wolf  et  al.),  we  find  empirically  that 
the  minimum  number  of  points  required  for  the  predic¬ 
tion  algorithm  to  go  as  something  like  20^,  where  d  is  the 


dimension  of  the  embedding  space,  although  this  is  probfj 
ably  an  overestimate  when  d  is  4  or  more.  ^ 

The  first  dynamical  system  for  which  we  present  re¬ 
sults  is  the  Henon  map  of  the  plane  given  by  Eqs.  (7).  We“ 
used  a  data  set  with  N=  1000D  entries.  The  results  are- 
shown  in  Fig.  5  and  Table  L  As  one  can  see,  the  numeri-^ 
cal  experiments  accurately  predict  the  accepted  value  of 
the  positive  Lyapunov  exponent  A]  =0.418.  Although  t 
for  ''  ;  M  =  1  case  the  computer  code  produced  a  reason-  1 
abl  -curate  prediction  of  the  negative  Lyapunov  ei-  -i 
ponent,  the  code,  in  principle,  will  not  yield  accurate 
values  of  the  negative  or  zero  Lyapunov  exponents.  This  • 
fact  is  bom  out  in  the  M =2  case  (which  is  not  shown). 
Furthermore,  we  know  of  no  method  that  will  produce 
negative  Lyapunov  exponents  from  an  experimental  data 
set.  Since  we  are  unable  to  reliably  determine  the  nega¬ 
tive  Lyapunov  exponents  from  the  data,  we  will  not  con¬ 
strain  F(y,a)  to  reproduce  the  negative  values  of  the 
spectra. 

It  should  not  be  surprising  that  we  are  unable  to  deter¬ 
mine  the  negative  Lyapunov  spectra  using  our  data  sets. 
We  have  assumed  that  the  data  describe  motion  on  an  at¬ 
tractor.  The  negative  Lyapunov  exponents  indicate  how 
points  in  the  phase  space  that  are  off  the  attractor  get 
onto  the  attractor.  The  portion  of  the  data  set  that  might 


reveal  how  points  off  the  attractor  get  to  the  attractor  is 
the  initial  transient.  This  transient  is  typically  very  short 
(sometimes  as  few  as  10  time  steps  r)  and  is  usually  dis¬ 
carded  or  otherwise  unavailable. 

A  relate^  issue  to  be  addressed  is  that  the  code  pro¬ 
duces  dm  exponents  regardless  of  the  actual  number  of 
Lyapunov  exponents  that  govern  the  dynamics  of  the 
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FIG.  5.  The  results  of  calculating  Lyapunov  exponents  by 
the  Eckmann  et  al.  method  for  Henon  data.  The  horizontal 
axis  is  dm,  the  assumed  dimension  of  the  dynamical  system  that 
produced  the  data  set.  Thus  the  method  will  produce  dm 
Lyapunov  exponents.  The  vertical  axis  contains  the  numerical 
values  calculated  for  the  d„  different  A’s.  The  two  horizontal 
lines  are  the  known  correct  values  for  A|  =0.418  and  A=  — 1.62. 
The  method  relates  d„  to  the  embedding  dimension  d  via 
d=(dm  —  \)M+\.  This  figure  shows  results  for  M  =  1.  Spuri¬ 
ous  exponents  are  labeled  with  squares  while  dynamical  ex¬ 
ponents  are  labeled  with  X's. 
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TABLE  L  Lyapunov  exponent!  for  the  Henon  attractor  M  =  1.  and  the  number  of  data  points  is  10000. 


dm 

X 

2 

A.,  =0.412 

Xj=  — 1.70 

3 

A.,  =0.412 

Xj= -0.662' 

Xj=  — 1.72 

4 

X, =0.408  j 

Xj=— 0.281 

■  X,= -0.655 

'  X,=  — 1.88 

5 

X, =0.408  7 

Xj= -0.0824 

Xj =0.305 

X,= -0.622 

X3=  — 1.55 

6 

X,  =0.407  -  ' 

Xj=0. 128 

X3= -0.144 

X<= -0.321 

Xj= -0.581 

X«=  — 1.56  1 

7 

X, =0.437 

Xj=0.323 

X,= -0.0767 

X,= -0.190 

X,= -0.335 

X*=  -0.604  ' 

X,=  — 1.54 

- 

8 

X,  =0.602 

Xj=0.382 

X3= -0.0509 

X4=— 0.118 

Xj=  -0.203 

X<=  -0.332 

X,= -0.642 

X,=  — 1.54 

9 

X,  =0.677 

Xj  =0.377 

Xj=0.0896 

X,=  -0.0390 

Xj= -0.124 

X*=  -0.203 

Xr= -0.324 

X,= -0.652 

X,=  -1.58 

Accepted  values  of  X 

=0.418 

- 

-1.62 

physical  system  in  question.  However,  it  is  relatively 
easy  to  determine  which  of  the  dm  exponents  govern  the 
dynamics  of  the  system  and  which  are  spurious.  We  as¬ 
sume  that  one  has  successfully  determined  the  minimum 
embedding  dimension  of  the  attractor  by  the  method  we 
presented  in  Sec.  II  (or  any  other  reliable  method  at  the 
reader’s  disposal).  Examination  of  Fig.  5  indicates  that 
most  of  the  spurious  exponents  are  negative.  These  nega¬ 
tive  exponents  are  necessary  to  contract  the  d- 
dimensional  phase  space  onto  the  attractor  whose  dimen¬ 
sion  is  dA<d.  The  one  positive  spurious  exponent  ap¬ 
pears  at  dm  =7  for  the  M  —  1  case.  We  know  from  Fig.  1 
that  the  dynamics  of  the  Henon  attractor  can  be  embed¬ 
ded  in  two  dimensions.  Hence  we  conclude  that  an  ex¬ 
ponent  that  exist  only  for  dm>l  must  be  spurious.  The 
origin  of  this  spurious  positive  exponent  is  discussed  by 
Eckmann  et  al.  It  is  believed  it  will  stabilize  at  a  vaier 
of  2A.|. 

We  have  averaged  the  calculated  values  of  Xj  for  the 
M  =  1  case  over  the  range  dm=  2-6.  We  discarded  the 
values  of  A.,  for  dm>  7  since  they  have  obviously  been  al¬ 
tered  by  the  spurious  Lyapunov  exponent  generated  at 
dm~ 7.  We  find  that  the  average  value  is  X|  =0.409, 
which  differs  from  the  accepted  value  of  0.418  by  only 
2%.  For  the  M  =  2  case  we  found  similar  results.  After 
averaging  we  find  that  X,  =0.420.  In  conclusion,  we  state 
that  by  comparing  the  M  =  1  and  2  cases  we  feel  that  the 
code  successfully  determined  the  positive  Lyapunov  ex¬ 
ponent  associated  with  the  Henon  attractor. 

We  now  turn  our  attention  to  the  second  dynamical 
system  we  wish  to  analyze,  the  Lorenz  system  of  ODE’s 
given  by  Eqs.  (8).  The  data  set  used  for  our  numerical  ex¬ 
periments  consisted  of  A  =  20  000  entries  and  was  gen¬ 
erated  by  integrating  Eqs.  (8)  forward  in  time  using  a  sim¬ 
ple  fourth-order  Rungc-Kutta  routine  with  a  fixed  time 
step.  We  chose  to  record  the  xt(r)  variable,  although  ei¬ 
ther  the  x2(t )  or  x2(t )  variable  would  do  as  well.  Figure 
6  is  a  graph  of  the  autocorrelation  function.  The  first 
minimum  is  at  n  ~  12  where  n  is  the  number  of  Rungc- 
Kutta  time  steps  of  length  0.03.  The  time  associated  with 
this  first  minimum  is  approximately  fc~0.36.  We  use  a 


sampling  rate  r=0.03,  which  is  approximately  -X  of  the 
autocorrelation  time.  Thus  we  use  every  Runge-Kutta 
data  point  as  our  experimental  data  set.  We  allowed  dm 
to  range  between  2  and  9  for  M  =  1  and  2.  The  results  of 
our  numerical  experiments  are  shown  in  Fig.  7  and  Table 
II. 

For  all  cases  M  =  1  and  2,  we  are  able  to  accurately 
determine  both  the  positive  and  the  zero  Lyapunov  ex¬ 
ponent.  The  accepted  value  of  A.,  is  1.50.  The  average  of 
the  calculated  values  of  A.,  for  dm  >  5  in  the  M  =  1  case  is 
X,  =  1.45,  which  is  an  error  of  only  3%.  As  with  the 
Henon  example,  we  found  better  results  for  the  M—2 
case. 

The  question  of  a  zero  Lyapunov  exponent  requires 
special  consideration.  Any  dynamical  system  that  is 
represented  by  an  ODE  will  contain  a  zero  Lyapunov  ex¬ 
ponent.  As  can  be  seen  from  Fig.  7  and  Table  II,  one  of 
the  Lyapunov  exponents  calculated  from  the  experimen¬ 
tal  acta  set  is  very  small  (as  much  as  two  orders  of  magni¬ 
tude)  compared  to  A.j.  We  also  notice  that  this  exponent 
is  very  stable  and  very  persistent.  It  exists  for  M  =  1  and 
2  over  the  entne  range  of  dm.  Given  this  behavior  and 


FIG.  6.  Autocorrelation  function  for  x,  (/),  the  Lorenz  sys¬ 
tem  from  20000  data  points. 
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FIG.  7.  The  results  of  applying  the  Eckmann  et  a!,  method 
of  calculating  Lyapunov  exponents  to  a  Lorenz  data  set.  The 
results  shown  are  for  M  =  1.  For  this  dynamical  system  there 
are  three  dynamical  exponents  at  X|  =  1.50,  Xj=0.0,  and 
Xj  =  —22.5. 


the  prevalence  of  ODE’s  as  dynamical  systems,  we  feel 
confident  in  predicting  a  zero  Lyapunov  exponent. 

Of  course,  we  have  the  luxury  here  of  knowing  that  our 
data  set  came  from  an  ODE.  This  type  of  knowledge 
concerning  the  origin  of  a  data  set  is  typically  unavail¬ 
able.  Thus  we  must  use  our  best  judgment  and  live  with 
the  fact  that  we  cannot  know  for  certain  whether  a 
Lyapunov  exponent  generated  by  the  Eckmann  et  al. 
method  should  be  interpreted  as  zero  or  just  very  small. 
Our  recommendation  is  that  one  compare  the  suspected 
zero  exponent  to  the  smallest  nonspurious  positive 
Lyapunov  exponent  generated  by  the  code.  If  the 
suspected  exponent  is  as  persistent,  as  stable,  and  more 
than  a  factor  of  25  smaller  than  the  smallest  positive  ex¬ 
ponent,  we  recommend  that  the  suspected  exponent  be 
assigned  the  value  zero. 


B.  Wolf-Swift-SwiaMjr-VaftiM  method 

A  second  technique  that  we  have  investigated  to  deter-, 
mine  Lyapunov  exponents  from  time  series  is  due  toi 
Wolf,  Swift,  Swinney,  and  Vastano  (WSSV).39  This  paper 
presents  two  algorithms,  one  for  determining  the  fuH! 
Lyapunov  spectrum  from  a  known  set  of  dynamical  equa-^ 
tions,  and  one  for  determining  only  the  largest  positive^ 
exponent  if  one  has  available  only  a  time  series  from  theH 
dynamical  system.  Since  the  paper  includes  the  source  1 
codes  for  the  two  algorithms,  we  copied  and  used  them"* 
directly.  The  WSSV  code  for  time  series  analysis  can 
only  determine  the  largest  positive  exponent.  We  have 
up  to  now  chosen  to  use  only  one  Lyapunov  exponent  as 
a  constraint  to  the  nonlinear  fitting  method,  and  so  this 
program  proves  sufficient  for  our  needs.  Given  the  ■. 
current  difficulty  of  accurately  determining  other  ex¬ 
ponents  from  a  time  series  of  data,  we  restrict  the  con¬ 
straints  to  one  Lyapunov  exponent.  In  addition  to  these 
considerations,  the  WSSV  code  is  exceptionally  easy  to 
use,  and  requires  relatively  minimal  amounts  of  data. 

The  WSSV  code  for  time  series  works  in  a  manner 
somewhat  similar  to  other  techniques  which  attempt  to 
approximate  in  some  way  the  local  tangent  space  about  a 
fiducial  orbit.  In  this  case,  the  code  initially  constructs 
the  time-delay  reconstructed  coordinates  for  the  system 
in  the  usual  manner,  taking  the  parameters  for  the  recon¬ 
struction  as  input  to  the  program.  The  calculation  of  the 
Lyapunov  exponent  then  begins  by  finding  the  nearest 
neighbor  in  the  reconstructed  phase  space  to  the  first 
point  of  the  orbit,  where  “nearest"  is  measured  using  the 
usual  Euclidean  metric.  Once  this  point  is  found,  the 
magnitude  of  the  difference  vector  between  the  two 
points  is  recorded.  The  algorithm  then  proceeds  by 
evolving  the  fiducial  point  along  its  trajectory,  and  the 
neighboring  point  along  its  trajectory,  a  given  number  of 
steps  of  the  time  series.  The  magnitude  of  the  final  sepa¬ 
ration  between  the  two  points  is  then  determined,  and  the 
contribution  to  the  Lyapunov  exponent  is  then  simply 
given  as  the  logarithm  of  the  final  separation  divided  by 
the  initial  separation,  divided  by  the  time  interval  of  evo- 


TABLE  II.  Lyapunov  exponents  for  the  Lorenz  attractor  M  —  1,  and  the  number  of  data  points  is 
20000. 
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FIG.  7.  The  results  of  applying  the  Ecktnann  et  al.  method 
of  calculating  Lyapunov  exponents  to  a  Lorenz  data  set.  The 
results  shown  are  for  M  =  1.  For  this  dynamical  system  there 
are  three  dynamical  exponents  at  A.,  =  1 . 50,  A =0.0,  and 
Aj=— 22.5. 


the  prevalence  of  ODE’s  as  dynamical  systems,  we  feel 
confident  in  predicting  a  zero  Lyapunov  exponent. 

Of  course,  we  have  the  luxury  here  of  knowing  that  our 
data  set  came  from  an  ODE.  This  type  of  knowledge 
concerning  the  origin  of  a  data  set  is  typically  unavail¬ 
able.  Thus  we  must  use  our  best  judgment  and  live  with 
the  fact  that  we  cannot  know  for  certain  whether  a 
Lyapunov  exponent  generated  by  the  Eckmann  et  al 
method  should  be  interpreted  as  zero  or  just  very  small. 
Our  recommendation  is  that  one  compare  the  suspected 
zero  exponent  to  the  smallest  nonspurious  positive 
Lyapunov  exponent  generated  by  the  code.  If  the 
suspected  exponent  is  as  persistent,  as  stable,  and  more 
than  a  factor  of  25  smaller  than  the  smallest  positive  ex¬ 
ponent,  we  recommend  that  the  suspected -exponent  be 
assigned  the  value  zero. 


B.  Wolf-Swift-Swiaaey-Vaataito  Method  ,1 

A  second  technique  that  we  have  investigated  to  deter-) 
mine  Lyapunov  exponents  from  time  series  is  due  toi 
Wolf,  Swift,  Swinney,  and  Vastano  (WSSV).39  This  paper 
presents  two  algorithms,  one  for  determining  the  fuHi 
Lyapunov  spectrum  from  a  known  set  of  dynamical  eqtuum 
tions,  and  one  for  determining  only  the  largest  positive^ 
exponent  if  one  has  available  only  a  time  series  from  the'1 
dynamical  system.  Since  the  paper  includes  the  source  ' 
codes  for  the  two  algorithms,  we  copied  and  used  them* 
directly.  The  WSSV  code  for  time  series  analysis  can 
only  determine  the  largest  positive  exponent.  We  have 
up  to  now  chosen  to  use  only  one  Lyapunov  exponent  as 
a  constraint  to  the  nonlinear  fitting  method,  and  so  this 
program  proves  sufficient  for  our  needs.  Given  the  ■ 
current  difficulty  of  accurately  determining  other  ex¬ 
ponents  from  a  time  series  of  data,  we  restrict  the  con¬ 
straints  to  one  Lyapunov  exponent.  In  addition  to  these 
considerations,  the  WSSV  code  is  exceptionally  easy  to 
use,  and  requires  relatively  minimal  amounts  of  data. 

The  WSSV  code  for  time  series  works  in  a  manner 
somewhat  similar  to  other  techniques  which  attempt  to 
approximate  in  some  way  the  local  tangent  space  about  a 
fiducial  orbit.  In  this  case,  the  code  initially  constructs 
the  time-delay  reconstructed  coordinates  for  the  system 
in  the  usual  manner,  taking  the  parameters  for  the  recon¬ 
struction  as  input  to  the  program.  The  calculation  of  the 
Lyapunov  exponent  then  begins  by  finding  the  nearest 
neighbor  in  the  reconstructed  phase  space  to  the  first 
point  of.the  orbit,  where  “nearest"  is  measured  using  the 
usual  Euclidean  metric.  Once  this  point  is  found,  the 
magnitude  of  the  difference  vector  between  the  two 
points  is  recorded.  The  algorithm  then  proceeds  by 
evolving  the  fiducial  point  along  its  trajectory,  and  the 
neighboring  point  along  its  trajectory,  a  given  number  of 
steps  of  the  time  series.  The  magnitude  of  the  final  sepa¬ 
ration  between  the  two  points  is  then  determined,  and  the 
contribution  to  the  Lyapunov  exponent  is  then  simply 
given  as  the  logarithm  of  the  final  separation  divided  by 
the  initial  separation,  divided  by  the  time  interval  of  evo- 
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FIG.  7.  The  results  of  applying  the  Eckmann  et  al.  method 
of  calculating  Lyapunov  exponents  to  a  Lorenz  data  set.  The 
Jesuits  shown  are  for  A/  =  1.  For  this  dynamical  system  there 
are  three  '  exponents  at  X|  =  l,50,  Xj=0.0,  and 
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A  second  technique  that  we  have  investigated  to  deters 
mine  Lyapunov  exponents  from  time  series  is  due  toi 
Wolf,  Swift,  Swinney,  and  Vastano  (WSSV).39  This  paper 
presents  two  algorithms,  one  for  determining  the  fuff! 
Lyapunov  spectrum  from  a  known  set  of  dynamical  eqtut^ 
tions,  and  one  for  determining  only  the  largest  positive^ 
exponent  if  one  has  available  only  a  time  series  from  theH 
dynamical  system.  Since  the  paper  includes  the  source  ' 
codes  for  the  two  algorithms,  we  copied  and  used  them-* 
directly.  The  WSSV  code  for  time  series  analysis  can 
only  determine  the  largest  positive  exponent.  We  have 
up  to  now  chosen  to  use  only  one  Lyapunov  exponent  as 
a  constraint  to  the  nonlinear  fitting  method,  and  so  this 
program  proves  sufficient  for  our  needs.  Given  the  ■. 
current  difficulty  of  accurately  determining  other  ex¬ 
ponents  from  a  time  series  of  data,  we  restrict  the  con¬ 
straints  to  one  Lyapunov  exponent.  In  addition  to  t  jese 
considerations,  the  WSSV  code  is  exceptionally  eas}  to 
use,  and  requires  relatively  minimal  amounts  of  data. 

The  WSSV  code  for  time  series  works  in  a  manner 
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concerning  the  origin  of  a  data  set  is  typically  unavail¬ 
able.  Thus  we  must  use  our  best  judgment  and  live  with 
the  fact  that  we  cannot  know  for  certain  whether  a 
Lyapunov  exponent  generated  by  the  Eckmann  et  al. 
method  should  be  interpreted  as  zero  or  just  very  small. 
Our  recommendation  is  that  one  compare  the  suspected 
zero  exponent  to  the  smallest  nonspurious  positive 
Lyapunov  exponent  generated  by  the  code.  If  the 
suspected  exponent  is  as  persistent,  as  stable,  and  more 
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ponent,  we  recommend  that  the  suspected  exponent  be 
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somewhat  similar  to  other  techniques  which  attempt  to 
approximate  in  some  way  the  local  tangent  space  about  a 
fiducial  orbit.  In  this  :ase,  the  code  initially  constructs 
the  time-delay  reconstructed  coordinates  for  the  system 
in  the  usual  manner,  taking  the  parameters  for  the  recon¬ 
struction  as  input  to  the  program.  The  calculation  of  the 
Lyapunov  exponent  then  begins  by  finding  the  nearest 
neighbor  in  the  reconstructed  phase  space  to  the  first 
point  of-the  orbit,  when  “nearest”  is  measured  using  the 
usual  Euclidean  metric.  Once  this  point  is  found,  the 
magnitude  of  the  difference  vector  between  the  two 
points  is  recorded.  The  algorithm  then  proceeds  by 
evolving  the  fiducial  point  along  its  trajectory,  and  the 
neighboring  point  along  its  trajectory,  a  given  number  of 
steps  of  the  time  series.  The  magnitude  of  the  final  sepa¬ 
ration  between  the  two  points  is  then  determined,  and  the 
contribution  to  the  Lyapunov  exponent  is  then  simply  •  * 
given  as  the  logarithm  of  the  final  separation  divided  by 
the  initial  separation,  divided  by  the  time  interval  of  evo- 
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ludon.-  These  contributions  are  then  averaged  over  the 
length  of  the  time  series 

This  simple  scheme  works  to  provide  the  largest 
Lyapunov  exponent  because,  given  arbitrary  initial  condi¬ 
tions  for  the  two  neighbors  and  an  appropriately  long 
evolution  time,  the  exponential  growth  due  to  the  largest 
positive  exponent  dominates  the  overall  behavior  of  the 
difference  vectors,  so  that  to  good  accuracy  the  net 
change  in  the  magnitude  of  the  two  vector*  reflects  al¬ 
most  solely  this  rate  of  growth*  Note  that  a  technical 
problem  exists  here,  in  that  the  Lyapunov  exponent  is 
defined  only  in  terms  of  the  linearized  equations  of 
motion  about  the  fiducial  trajectory,  and  the  exponential 
divergence  of  neighboring  trajectories  can  rapidly  drive 
them  out  of  the  linear  regime.  In  the  WSSV  code,  this 
problem  is  addressed  in  a  straightforward  manner;  i.e., 
when  the  distance  between  the  neighbors  becomes  too 
large,  the  algorithm  abandons  this  point  and  searches  for 
a  new  neighbor.  A  suitable  new  neighbor  is  chosen  on 
the  basis  of  two  criteria:  first,  the  point  must  again  be  as 
close  to  the  fiducial  trajectory  point  as  possible,  and 
second,  the  orientation  of  the  abandoned  difference  vec¬ 
tor  must  be  preserved  as  nearly  as  possible.  The  process 
of  choosing  a  new  neighbor  using  these  criteria  is  thus 
approximately  equivalent  to  rescaling  the  difference  vec¬ 
tor  to  a  much  smaller  size.  In  practical  terms  there  »s  a 
t  •ade-off  between  choosing  points  which  are  very  close  to 
tue  fiducial  point  and  points  whose  difference  vectors  lie 
nearly  along  the  ray  defined  by  the  abandoned  difference 
vector.  This  pioblem  is  handled  internally  in  the  code  by 
a  multistep  search  algorithm.  Once  a  suitable  new  neigh¬ 
bor  is  determined,  the  new  difference  vector  is  then 
evolved  until  it  too  becomes  too  large,  and  then  the  pro¬ 
cess  is  repeated. 

Because  this  numerical  procedure  is  relatively  straight¬ 
forward,  there  are  actually  few  variables  necessary  as  in¬ 
put  to  the  algorithm,  and  hence  the  program  is  much 
easier  to  us  -  '.han  other  Lyapunov  exponent  algorithms. 
There  are  seven  basic  variables  which  must  be  set  to  per¬ 
form  an  analysis  off*  .  ata  set,  :oost  of  which  are  deter¬ 
mined  when  one  calculates  the  embedding  dimension  as 
in  Sec.  II.  The  first  four  variables,  which  are  related  to 
the  time-delay  reconstruction,  are  the  number  of  points 
in  the  data  set  (AO,  the  embedding  dimension  d,  recon¬ 
struction  time  delay  rd,  and  the  sampling  rate  for  the 
data  rt.  The  first  of  these  variables  N  is  usually  fixed 
when  an  experimental  time  series  is  being  analyzed,  al¬ 
though  some  criterion  for  the  minimum  number  of  points 
necessary  for  a  good  estimate  of  the  Lyapunov  exponent 
can  be  given.  Wolf,  Swift,  Swinney,  and  Vastano  give  a 
genera!  rule  for  the  minimum  number  of  data  points  as  at 
least  10‘/,  although  this  value  can  depend  on  the  topology 
of  the  attractor  and  the  relative  magnitudes  of  the 
Lyapunov  exponents.  Our  experience  has  shown  that  at 
least  twice  this  number  of  points  is  usually  necessary  for 
tv.'o  significant  figures  of  accuracy,  and  greater  accu.acy 
can  require  much  longer  time  series.  It  should  be  noted 
that  in  terms  of  the  algorithm,  longer  time  series  are  re¬ 
quired  not  just  to  improve  the  convergence  by  providing 
more  contributions  to  the  Lyapunov  value;  longer  time 
se,  'ri  so  provide  a  higher  density  of  points  on  the  at¬ 


tractor  iq  hence  there  are  more  nearby  neighbors  to 
choose  from  when  replacements  are  necessary,  making 
this  process  more  accurate. 

The  embedding  dimension  parameter  d  is  the  dimen¬ 
sion  of  the  time-delay  reconstructed  vectors  y (n ),  and  is 
determined  as  in  Sec.  II.  As  discussed  there,  the  dimen¬ 
sion  of  the  embedding  space  must  be  sufficiently  large  to 
ensure  the*  none  of  the  dynamical  information  about  the 
attractor  k  lost;  however,  needlessly  large  values  of  the 
embedding  dimension  results  in  greatly  increased  compu¬ 
tation  time  for  the  Lyapunov  calculation  and  also  in¬ 
creased  sensitivity  to  noise.  For  the  example  systems  that 
we  have  investigated  using  this  method,  we  have  chosen 
the  embedding  dimension  to  be  the  next  highest  integer 
dimension  to  the  (known)  fractal  dimension,  although  for 
experimental  data,  where  one  is  not  sure  of  the  fractal  di¬ 
mension,  one  may  often  feel  safer  to  choose  a  larger 
value. 

The  second  variable  that  is  necessary  for  the  time-delay 
reconstruction  in  the  program  is  the  actual  time  delay 
value  rd.  This  variable,  as  discussed  in  Sec.  II,  gives  thev*- 
time  separation  of  the  components  of  the  d  vectors  in 
terms  of  the  number  of  iterates  of  the  time  series,  and  can 
be  thought  of  as  being  chosen  to  make  the  d  components 
as  “orthogonal”  as  possible.  For  dynamical  time  series 
derived  from  a  mapping,  as  for  the  Henon  system,  this 
value  can  be  chosen  to  be  1,  since  each  iterate  generally 
represents  one  entire  “orbit"  on  the  attractor  of  the  flow 
that  the  mapping  is  derived  from.  For  continuous 
phase-space  flows,  as  for  the  Lorenz  system,  one  can 
often  use  the  rule  of  thumb  given  by  drrf  =  l,  where  d  is 
the  embedding  dimension  and  rd  is  here  given  as  the  frac¬ 
tion  of  the  orbital  period  on  the  attractor,  which  must 
then  be  expressed  in  time  series  steps.  Another  more  so¬ 
phisticated  method  is  to  take  rd  as  roughly  the  first  zero 
of  the  autocorrelation  function  for  the  time  series.  The 
choice  of  method  for  determining  the  time  delay  is  not 
crucial,  however,  since  the  reconstructed  dynamics  is 
generally  not  strongly  dependent  on  the  exact  value  as 
long  as  it  is  within  a  reasonable  range  of  the  correct 
value. 

The  fourth  variable  Ts  is  the  time  between  successive 
measurements  in  the  time  series,  or  rather  the  inverse  of 
the  sampling  rate.  This  value  is  not  actually  a  variable, 
but  rather  an  additional  piece  of  information  that  must 
be  supplied  with  any  time  series,  and  is  used  in  the  algo¬ 
rithm  to  rescale  the  Lyapunov  exponents  by  setting  the 
time  scale  for  the  rate  of  divergence  of  the  trajectories. 
Although  one  may  have  no  control  over  the  sampling 
rate  for  an  arbitrary  set,  for  systems  where  one  does  have 
control  this  parameter  is  an  important  issue,  and  can 
greatly  affect  the  qurlity  of  data.  Many  of  the  aspects  of 
problems  that  can  arise  are  from  improper  sampling  rates 
are  discussed  by  Mayer-Kress.*0 

Two  of  the  input  variables  to  the  algorithm  are  con¬ 
cerned  with  setting  length  scales  for  the  reconstructed 
dynamics.  The  parameter  Sm„  controls  the  maximum 
distance  that  the  algorithm  will  look  for  neighbors  when 
it  attempts  replacement.  Since  we  take  a  rough  value  for 
the  limit  of  the  validity  of  the  linear  approximation  to  be 
about  1%  of  the  macroscale  of  the  attractor,  the  value  of 
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Sm„  should  be  taken  at  somewhat  less  than  this  value. 
Of  course,  making  this  parameter  smaller  will  increase 
accuracy;  however,  the  density  of  points  on  the  attractor 
determines  how  small  it  can  be,  and  making  5^,  too 
small  also  has  the  unfortunate  effect  of  greatly  increasing 
the  computation  time  of  the  algorithm.  It  is  instructive 
to  do  some  experimentation  with  the  effect  of  this  vari¬ 
able  when  analyzing  a  data  set,  however,  we  have  found 
that  the  1%  rule  is  usually  a  good  guess.  The  second 
scale  variable  is  SmiB,  which  sets  the  minimum  distance 
that  the  algorithm  will  look  for  neighbors  during  replace¬ 
ment.  The  purpose  of  this  parameter  is  to  reduce  the 
effects  of  very  small  levels  of  noise  by  eliminating  the 
choice  of  neighbors  which  are  so  close  that  they  are 
within  the  scale  of  distance  that  the  noise  defines.  Since 
we  deal  with  “clean”  data  sets  throughout  the  discussion 
in  this  paper,  the  value  of  n  was  set  quite  low.  For  ac¬ 
tual  experimental  data  corrupted  by  noise,  a  good  deal  of 
experimentation  with  this  variable  is  probably  necessary, 
as  it  is  difficult  to  estimate  the  effective  scale  that  the 
noise  will  appear  on.  It  should  be  noted  that  this  param¬ 
eter  is  only  effective  at  reducing  the  effects  of  very  small 
magnitudes  of  noise,  as  we  have  found  that  Smn  can  usu¬ 
ally  be  not  much  larger  than  about  1%  of  5^,  or  the  al¬ 
gorithm  has  difficulty  finding  sufficient  numbers  of  neigh¬ 
bors  within  the  linear  regime  for  replacement. 

The  last  input  parameter  to  the  algorithm  is  TE,  which 
gives  the  evolution  time  (in  time  series  steps)  that  a  given 
pair  of  neighbors  are  allowed  to  evolve  before  replace¬ 
ment.  The  value  of  this  variable  can  greatly  effect  the  ac¬ 
curacy  of  the  calculation  of  the  Lyapunov  exponent,  for  a 
number  of  reasons.  If  the  evolution  time  is  too  short,  the 
difference  vector  between  the  two  neighboring  trajec¬ 
tories  may  not  have  sufficient  time  to  evolve  with  the  dy¬ 
namics  on  the  attractor,  and  the  frequent  replacement 
process  can  introduce  considerable  inaccuracies.  If  the 
evolution  time  is  too  long,  the  neighboring  points  can 
often  evolve  to  distances  which  are  greater  than  the 
linearized  regime,  and  so  these  contributions  are  also 
inaccurate.  Additionally,  for  attractors  which  may  have 
a  multilobed  structure,  such  as  the  Lorenz  attractor, 
enormous  errors  can  be  introduced  if  the  evolution  time 
is  sufficiently  long  to  allow  two  neighboring  points  to 
eventually  evolve  along  the  two  separate  lobes. 

To  choose  TE  for  a  time  series  produced  by  a  map,  one 
or  two  iterations  of  the  map  is  usually  a  good  value,  as 
was  the  case  for  the  Henon  system.  For  a  flow,  some  ex¬ 
perimentation  must  be  done.  A  good  general  mle  is  that 
the  evolution  time  for  a  flow  should  be  on  the  order  of  y 
to  ly  orbital  periods  on  the  attractor,  although  this  again 
can  depend  on  the  magnitude  of  the  Lyapunov  ex¬ 
ponents.  When  one  only  has  a  time  series  to  work  with, 
an  orbital  period  for  the  system  can  be  determined  by 
taking  a  power  spectrum  of  the  time  series  and  picking 
the  dominant  feature,  if  any.  Once  a  rough  estimate  of 
what  the  evolution  time  should  be  is  obtained,  it  is 
strongly  advised  to  calculate  the  Lyapunov  value  for  a 
range  of  evolution  times  around  the  rough  value.  The 
computed  values  of  the  Lyapunov  exponent  versus  the 
evolution  time  will  usually  remain  flat  for  some  range  of 


the  evolution  times,  and  a  value  within  this  stable  : 
is  usually  an  accurate  choice. 

Using  the  above  general  guidelines,  the  Wolf  code  watij 
used  to  determine  the  Lyapunov  exponents  of  two  sample  \ 
systems  for  which  the  exponents  are  already  known:  the  > 
Henon  map  and  the  Lorenz  system.  In  both  cases,  all  of i 
the  parameters  could  be  chosen  ahead  of  time  with  good  • 
confidence,  with  the  exception  of  the  evolution  time  (Ts).' 
For  this  parameter,  a  series  of  runs  with  differing  Tf 
values  were  done,  as  a  check  of  the  stability  of  the 
Lyapunov  value  with  different  evolution  times,  and  to 
demonstrate  how  this  may  be  done  with  other  parameters 
for  which  good  guesses  are  not  available  a  priori. 

For  the  Henon  map  (whose  dimension  is  known  to  be 
1.26),  we  chose  d=2,  and  W=2000,  although  about  1000 
(=30^)  would  probably  suffice.  Since  the  system  is 
defined  by  a  mapping,  we  choose  rd  —  1  (this  is  verified 
using  the  autocorrelation  calculation,  Fig.  8),  and  like¬ 
wise  tc  —  1.  Since  the  largest  scale  of  the  map  is  about  4, 
we  chose  to  be  0.25  to  0.05.  Also,  since  the  data  are 
generated  numerically,  the  only  noise  is  from  machine  er¬ 
ror,  so  we  chose  5min  to  be  a  conservative  10-5.  Note 
that  some  experimentation  was  conducted  with  these 
values,  but  that  the  result  of  the  calculation  showed 
was  not  greatly  affected  for  parameter  values  within 
reasonable  limits  of  the  ones  given,  although  the  run 
times  could  be  considerably  affected  for  too  small. 
For  the  remaining  parameter  TE  we  present  a  graph  of 
the  value  of  the  largest  exponent  A.  versus  the  value  of  TE 
(Fig.  9).,  Note  that  there  is  a  plateau  in  the  value  of  A  at 
about  0.624.  for  values  of  the  parameter  TE  out  to  about 
5,  after  which  it  drops  off  sharply.  Note  that  even 
though  the  characteristic  time  for  his  map  is  1,  we  see 
that  A |  is  stable  to  a  reasonably  large  variation  in  TE. 
The  value  we  obtain  for  Aj  is  within  about  3%  of  the 
value  quoted  by  Wolf  et  al. 

For  the  second  example,  the  Lotenz  system,  a  data  set 
was  generated  by  integrating  the  dynamical  equations 
with  a  Runge-Ru'ca  integrator,  using  a  time  step  [  =  1 
(sampling  rate)]  of  about  0.03  sec.  Since  the  characteris- 
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FIG.  8.  Autocorrelation  function  for  xt(rt)  in  the  Henon  sys¬ 
tem  from  2000  data  points. 
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FIG.  9.  k  vs  Te  for  the  WSSV  method  in  the  H6non  map. 

tic  time  for  the  Lorenz  attractor  is  about  0.5  sec,  this 
gives  about  17  points  per  orbit  on  the  attractor.  The  di¬ 
mension  of  the  attractor  is  known  to  be  about  2.06,  and 
an  embedding  dimension  of  d  —  3  was  chosen.  The 
minimum  number  of  data  points  required,  as  estimated 
by  our  rule  of  thumb,  is  about  5000,  so  we  close  a  set  of 
10000  points  (=iV).  The  autocorrelation  calculation 
(Fig.  6)  suggests  a  value  of  tc  =  13,  and  rrf  is  0.03.  Since 
the  maximum  scale  of  the  Lorenz  attractor  is  about  40, 
was  chosen  to  be  about  0.4  or  0.5,  and  Smin  was 
chosen,  by  the  same  arguments  as  for  the  Henon  system, 
to  be  about  10~5.  A t  for  the  previous  example,  we  calcu¬ 
lated  the  largest  Lyapunov  exponent  for  a  range  of  the 
last  parameter  TE  and  these  results  are  shown  in  Fig.  10. 
From  the  graph,  one  notes  that  A.1  settles  into  a  some¬ 
what  flat  region  by  a  value  of  TE  of  about  16  or  so  (one 
orbital  period)  and  remains  roughly  so  until  about  30 
(two  y  orbital  periods).  There  is  still  a  considerable  varia¬ 
tion  in  the  values  of  k  along  this  region,  which  very  likely 
indicates  that  the  convergence  is  still  not  very  good  and  a 
longer  data  set  is  necessary.  The  average  value  from  this 


FIG.  10.  k  vs  fE  for  the  WSSV  method  in  the  Lorenz  sys¬ 
tem. 


regime  is  about  2.22,  which  is  within  2.7%  of  the  value 
2.16  quoted  by  Wolf  et  ai 

It  is  worth  noting  that,  because  of  the  double-lobed 
structure  of  the  Lorenz  attractor,  the  program  can  often 
be  "fooled"  by  choosing  two  nearby  initial  orbits  which 
wind  up  on  different  lobes  of  the  attractor,  thereby  giving 
erroneous  contributions  to  the  averaged  exponent.  In 
this  sense,  the  Lorenz  system  is  a  somewhat  difficult  case 
for  study  using  the  fixed-time-evolution  program,  and 
hence  the  results  can  be  somewhat  less  accurate  than 
would  be  expected. 

Calculations  of  the  largest  Lyapunov  exponent  were 
carried  out  for  other  systems  beside  the  two  examples 
above,  and  in  all  cases  the  worst  errors  were  on  the  order 
5%,  with  most  values  being  about  1-2  %  of  the  expected 
exponent.  We  conclude  that,  at  least  for  the  largest  posi¬ 
tive  exponent,  the  above  code  is  relatively  simple  to  use 
and  provides  reliable  and  reasonably  accurate  results. 
Although  we  have  not  tested  them  yet,  more  elaborate 
versions  of  the  code  promise  greater  accuracy,  as  well  as 
the  calculation  of  the  rest  of  the  positive  Lyapunov  ex-v,. 
ponents.  The  one  drawback  of  the  method  is  that  it  does 
not  allow  for  calculation  of  the  negative  exponents  of  the 
spectrum,  although  current  research  suggests  that  it  may 
be  possible  to  capture  at  least  the  largest  negative  ex¬ 
ponent  using  time  reversal  of  the  data  sequence. 

Some  experimentation  was  done  with  calculating  the 
largest  Lyapunov  exponent  for  a  few  other  systems,  and 
in  all  cases  the  worst  errors  were  on  the  order  of  4-5  %, 
with  most  values  being  within  1-2  %  of  the  expected  ex¬ 
ponent.  We  can  conclude  from  these  studies  that  the 
WSSV  code  provides  a  very  simple  and  reasonably  accu¬ 
rate  way  of  determining  the  largest  Lyapunov  e-  ponent, 
and  does  not.  require  the  excessive  amounts  of  data  that 
some  of  the  other  algorithms  seem  to  need.  For  applica¬ 
tions  where  only  the  dominant  behavior  of  the  spreading 
of  nearby  trajectories  is  needed,  and  where  it  is  not  neces¬ 
sary  to  know  the  remaining  Lyapunov  exponents,  this  al¬ 
gorithm  can  prove  very  useful. 

C.  Lyapunov  exponents  from  the  map  F(y, a) 

Whatever  method  one  chooses  to  use  for  determining 
the  Lyapunov  exponents  from  the  data,  and  we  have  ex¬ 
amined  only  two  possible  methods  proposed  in  the  litera¬ 
ture,  we  must  now  establish  a  way  to  express  these  same 
quantities  in  terms  of  our  map  F(y,a).  A  direct  tran¬ 
scription  of  the  methods  of  Shimada  and  Nagashima,15 
Benettin,  Froeschle,  and  Scheidecker,16  or  others  would 
lead  to  a  correct  prescription,  but  not  one  which  is  easily 
used  in  the  optimization  or  fitting  we  wish  to  do  using  the 
function  F(y,a).  The  point  is  that  one  can  achieve  better 
results  in  this  fitting  if  one  has  available  a  useful  analytic 
formula  for  the  derivatives  of  the  constraints  with  respect 
to  the  parameters  a.  We  will  choose  then  a  slightly 
different  way  of  expressing  the  Lyapunov  exponents  in 
terms  of  the  map  F(y,a)  than  appears  in  the  literature. 
Ours  may  be  a  useful  technique  in  itself. 

Lyapunov  exponents  characterize  the  way  in  which 
neighboring  points,  small  areas,  or  small  volumes  near 
the  orbit  of  interest  evolve  under  the  mapping.  To  find 
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them  one  linearizes  the  mapping  yin +l)'=F(y(n),a) 
iround  a  given  orbit  y(  1  ),y(2), . . . ,  ydV).  Small  devia- 
.  ^  ,tion$  from  this  orbit,  call  themfiyin),  evolve  as 

8y(n  +l)=M(y(n))5y(n) j...’  _  \  ‘  . 

where  ••  ‘  '» -r-r'.j  yr.-  ;... 

-  ■  ■■ •  ■-!  '•  ’\r.  . 

dyt  ■  .  ..  .;>.<,<■  ■- 

\ 

is  evaluated  along  the  orbit  of  interest.  The  Lyapunov 
exponents  are  found  from  the  eigenvalues  of  the  matrix 
M*(y(l)) 

'  M*(y(  l))=M(y(#n)M{y{Jf  —  1))  *  •  *  M(y(  1 ) )  , 

which  has  information  about  the  orbit  generated  by 
F(y, a)  beginning  at  y(l).  Indeed,  calling  the  Lyapunov 
exponents  X,,  1  =  1,2, . . .  ,d  the  eigenvalues  of  M*(y(l)) 
are  exp lrKXt)  in  the  limit  as  K—*oo.  The  r  in  this  ex¬ 
pression  is  the  same  one  we  set  to  unity  in  Sec.  II.  For 
the  Henon  system  (a  map)  r  is  1,  while  for  the  Lorenz  sys¬ 
tem  (an  ODE)  r=0.03  (cf.  the  Eckmann  method  in  this 
section). 

The  familiar  method  of  finding  these  X,’s  (Refs.  10,  15, 
and  16)  is  to  apply  the  matrix  M*  to  an  arbitrary  vector 
w.  Then  forming 

-uta  (?) 

tK  Ml 

yields  the  largest  exponent  X,  for  large  K.  To  find  the 
next  largest  exponent  X2  one  applies  M*  to  the  elements 
of  an  outer  product  w'Xw2,  and  calculates  the  logarithm 
of  the  norm  of  this  vector  for  large  K.  This  gives  the  sum 
of  Xj  and  Xj.  Continuing  in  this  fashion,  the  full 
Lyapunov  spectrum  may,  in  principle,  be  extracted. 

While  the  expression  of  the  X/s  as  logarithms  of  the 
norms  of  various  vectors  to  which  M*  has  been  applied  is 
correct,  it  presents  serious  problems  in  evaluating  the 
derivatives  with  respect  to  the  parameters  a  of  the  map¬ 
ping  F(y,a)  from  which  M  is  formed.  So  we  take  a 
slightly  different  approach. 

We  note  that  the  trace  of  the  matrix  M*  contains  the 
information  on  Lyapunov  exponents  we  desire.  Our  first 
observation  is  that  the  expression 

d 

,  tr(M*)=  j  cxp{rKXm) 

m  —  I 

allows  us  to  find  the  largest  exponent  X|  by 

X^—biftrtM*)]  (10) 

in  the  formal  limit  that  K-*  » .  This  expression  is  much 
more  conducive  to  differentiation  with  respect  to  the  pa¬ 
rameters  a  (recall  that  M  is  a  function  of  a)  since  we  have 
to  deal  with  the  logarithm  of  a  simple  scalar,  the  trace  of 
M*,  rather  than  the  norm  of  a  vector  ||M*w||  as  in  Eq. 
(9). 

One  can  find  an  expression  for  the  next  exponent  X}  by 
observing  that  the  combination 


ftrtM^-trfM2*) , 


where  ,JSj 

Mw:(y(l))=M(y(2 K))  M 

XM(y(2/f  — 1))  •  •  •  M(y(2))M(y(l)) 

behaves  as  exp[rXT(X|  -frXj)}*  for  large  K.  So  we  can  find  3 
thesumofXj+Xjby  1 

X,  ■ +  X2 ; =  —  !n  { [  tr(  M*)  ]2 -  tr(  M2*) )  J 

for  large  K.  It  is  straightforward  to  construct  expressions  ■ 
for  the  sum  of  exponents  up  to  order  m  by  recognizing  . 
the  terms  in  the  above  logarithms  as  those  of  an  expan- 
sion  of  the  trace  of  the  mlh  power  the  matrix  - 
(M%-tr<M*)8y. 

In  any  case,  our  procedure  is  now  clear.  Use  whatever  * 
means  available  to  evaluate  the  X,’s  from  the  data.  Then 
form  the  indicated  logarithms  of  combinations  of  traces 
of  the  matrices  M*,  M2*,  etc.  as  computed  from  the 
parametrized  mapping  F(y,a).  Equating  the  X,’s  evalu¬ 
ated  from  the  data  to  the  expressions  for  the  X,’s  in  terms 
of  F(y,a)  gives  us  a  set  of  d  constraints.  We  impose  these 
constraints  on  our  choice  of  the  parameters  a  in  conjunc¬ 
tion  with  the  minimization  of  our  cost  function. 

Our  actual  practice  restricts  attention  to  the  largest 
Lyapunov  exponent  X,  since  that  is  the  only  one  we  know 
how  to  reliably  extract  from  data.  Thus  only  the  trace  of 
M*  is  needed  in  our  constraints.  It  seems  to  us  a  matter 
of  some  importance  to  devise  accurate  methods  to  deter¬ 
mine  the  full  spectrum  of  Lyapunov  exponents  from  data. 
They  would  be  useful  in  the  program  we  are  engaged  in, 
and  they  act  as  classifiers  for  nonlinear  dynamical  sys¬ 
tems  with  broadband  power  spectra.  In  the  case  of 
broadband  spectra,  sharp  lines  are  not  available  for  clas¬ 
sifying  and  one  must  turn  to  the  kind  of  dynamical  in¬ 
variant  we  have  here. 

IV.  INVARIANT  DISTRIBUTIONS  ON  THE  ATTRACTOR 

The  frequency  with  which  orbits  y(n)  visit  regions  of 
the  phase  space  Rrf  defines  an  invariant  distribution  func¬ 
tion,  pi y),  which  is  formally  defined  for  the  mapping 
yin  +l)=F(y(n))  as 

pi y)=  Hm  -j-  £  8rf(y-F*(y(l)))=  lim  ps(y)  .  (11) 

W-®  iY  ^  =  j  W  —  » 

In  a  similar  fashion,  the  invariant  distribution  for  a  nu¬ 
merical  data  set  y(»), «  =1, . . .  .//is  given  by 

t  N 

pi y)=  lim  -£■  2  Sd(y-y (X))  .  (12) 

N-n  N k~i 

Eckmann  and  Ruelle10  discuss  the  features  of  ply)  at 
some  length.  In  particular,  they  address  the  question  of 
the  dependence  of  ply)  on  the  initial  point  y(l).  They 
state  that  any  two  initial  points  in  the  basin  of  attraction 
will  lead  to  the  same  ply).  In  this  sense  pi  y)  is  invariant. 
For  a  dynamical  system  with  two  attractors  it  is  possible 
for  their  basins  of  attraction  to  be  intertwined  in  a  com¬ 
plicated  way.  Any  uncertainty  in  the  initial  point  y(l) 
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due  to  noise,  machine  round  off,  etc.,  may  effect  our  abili¬ 
ty  to  say  with  certainty  the  attractor  to  which  a  particu¬ 
lar  y(  1)  will  go.  Also,  in  the  absence  of  noise  there  may 
be  particular  y(  1  )’s  (often,  but  not  exclusively,  on  a  set  of 
measure  zero  in  Rrf)  that  lead  to  nongeneric  orbits.  An 
example  of  this  type  of  nongeneric  behavior  would  be  an 
unstable  fixed  point  or  periodic  orbit  in  the  presence  of  a 
strange  attractor.  In  any  event  we  will  assume  that  noise 
levels  are  low  and  the  only  nongeneric  orbits  are  unstable 
and  of  measure  zero  in  the  phase  space.  In  this  case, 
once  a  particular  y(l)  has  moved  beyond  its  transient 
stage  the  frequency  with  which  it  visits  various  portions 
of  the  attractor  is,  by  definition,  p(y). 

The  complete  invariant  density  ply)  has  too  much  in¬ 
formation  in  it  for  our  purposes.  (We  could  not  constrain 
a  cost  function  to  reproduce  every  point  on  the  invariant 
density  without  an  inordinate  amount  of  work.)  Any 
finite  sequence  of  N  points  has  a  finite  resolution  on  the 
attractor.  That  resolution  is  approximately  N  A, 
which  is  the  order  of  the  mean  distance  of  N  points  on  a 
d ^-dimensional  sec.  Furthermore,  we  will  never  actually 
resolve  the  detailed  S-function  resolution  implied  by  Eqs. 
(11)  and  (12). 

To  handle  this  matter  of  finite  resolution  we  introduce 
a  complete  orthonormal  set  of  functions  t^p(y)  defined  on 
Rrf  which  can  serve  as  a  basis  set.  We  then  expand  p(y) 
in  terms  of  this  basis 

p( y)=  i  B^( y) .  (13) 

Truncating  this  expansion  at  some  finite  order  (p=G) 
provides  a  finite-resolution  representation  corresponding 
to  whatever  information  we  have  on  p(y).  The 
coefficients  B ^  will  be  the  invariants  of  the  dynamical 
process  which  characterize  p(y)  within  a  given  basis 
V^(y).  After  our  discussion  of  how  to  select  the  ^(y)'s 
we  will  establish  how  one  extracts  B^’s  both  from  the 
data  vectors  y(n)  and  from  the  parametrized  map  F(y,a). 
Equating  the  Bfs  from  the  data  to  those  from  the  map 
will  be  our  final  constraints  on  the  cost  function  C(X, a). 

While  any  complete  orthonormal  set  of  functions  ^(y) 
would  do  to  determine  our  Bfs,  some  are  more  appealing 
than  others.  For  example,  Fourier  series  formed  by  tak¬ 
ing 

m=(m1,m2,...,mn) 

are  formally  fine.  However,  since  the  attractor  is  occupy¬ 
ing  only  a  small  portion  of  Rd,  most  of  the  work  per¬ 
formed  by  the  Fourier  representation  of  p(y)  will  be  ex¬ 
pended  in  making  p(y)  vanish  off  the  attractor.  What  we 
seek  are  orthonormal  functions  concentrated  on  the  at¬ 
tractor,  so  all  the  work  in  the  expansion  ofp(y)  is  expend¬ 
ed  exhibiting  structure  where  the  attractor  is  located. 
This  would  also  result  in  the  need  for  many  fewer  B M 
than  required  for  Fourier  series  or  other  familiar  choices 
for  ^(y). 

An  optimal  choice  using  information  in  the  data  set  is 
constructed  as  follows.26,27  Take  the  total  data  set  y(n), 
n  ~  1,2, ... ,  and  divide  it  into  two  portions.  The  first 
portion  (of  length  AO  will  be  treated  as  the  data  we  are 


trying  to  model.  The  second  portion  of  the  data  set  (of 
length  N ')  will  be  used  to  construct  orthonormal  func¬ 
tions.  These  orthonormal  functions  will  be  the  tp^ly  )’s 
that  we  will  use  in  our  expansion  of  p(y),  shown  in  Eq. 
(13).  To  explicitly  construct  these  functions  we  further 
divide  the  second  portion  of  the  data  into  G  groups  of 
length  L  1 N'=LG ).  Each  group  is  a  sample  of  the  invari¬ 
ant  attractor.  If  L  is  large  enough,  each  sample  is  a 
significant  look  at  ply).  Treat  each  of  the  G  data  sets  as 
an  independent  sample  of  ply)  and  form  the  invariant 
distribution  for  the  ath  sample 

Pjy)=4~  2  8d!y-y!k,a))  ,  (14) 

L  k- 1 

with  a=l,2,  ...,G.  The  data  point  y  Ik, a)  is  the  kth 
member  of  the  ath  sample.  Of  course,  the  mean  density 
of  the  G  samples  is  just  the  total  invariant  density  of  the 
data  set  of  length  N', 

pW=7:ipJy)-  U5) 

°a-l  ^ 

From  the  G  samples  pal  y)  we  form  the  following  phase- 
space  correlation  function 

Rlz,w)=~  ^  PaWpJvl  •  (16) 

l7c“l 

It  can  be  shown22,26  that  the  normalized  eigenfunctions 
of  this  correlations  function  are  the  optimal  eigenfunc¬ 
tions  for  expansion  of  functions  localized  on  the  attrac¬ 
tor.  dptimal  means  that  these  eigenfunctions  provide  the 
best  representation  in  a  least-squares  sense  of  the  infor¬ 
mation  in  p(y)  when  expressed  as  a  finite  series  in  an 
eigenbasis.  The  label  a  is  to  be  treated  as  a  sampling  in¬ 
dex  from  a  set  of  independent  looks  at  the  data  each  of 
which  is  to  be  thought  of  as  selected  from  a  uniform  sta¬ 
tistical  distribution  of  invariant  densities.  The  various 
averages  over  a  then  appear  quite  natural. 

The  requirement  that  ^(y)’s  be  an  eigenfunction  of 
i?(z,w)  leads  to 


f  ddzRl w,z)^/J(z)=/i^(w)  . 

(17) 

The  ^lyYs  are  normalized  es  follows: 

f  ddw  ^(w)^.(w)=8w- . 

(18) 

As  the  number  of  samples  G  becomes  infinite,  the  set  of 
eigenfunctions  becomes  complete  in  the  usual  least- 
squares  sense.  If  we  insert  Eq.  (16)  into  Eq.  (17),  we  see 
that  for  finite  G,  R1  w,z)  becomes  a  finite  sum  of  separ¬ 
able  kernels.  It  is  easily  seen  that  in  this  case  the  eiven- 

■>  V 

functions  xp^ly)  must  have  the  form 

*„<y>=  2  Clpfy)  .  (19) 

a*  1 

The  eigenfunctions  defined  in  this  fashion  are  localized 
near  the  attractor,  just  as  we  wished.  This  follows  direct¬ 
ly  from  Eq.  (19)  since  ip.,1  y)  is  made  of  the  palyYs  which 
vanish  off  the  attractor. 

Inserting  Eqs.  (16)  and  (19)  back  into  Eq.  (17)  reduces 
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the  eigenvalue  equation  ‘o  a  finite  matrix  problem  The 
*  coefficients  C£  are  the  G  vectors  which  are  eigenvectors 
,  of  the  G  X  G  matrix 


2  AriC^pC*  .  '  (21) 

0-1 

We  now  turn  to  the  normalization  condition  Eq.  (18).  In¬ 
serting  the  representation  for  ^(y)  given  by  Eq.  (19)  into 
Eq.  (18),  and  using  the  relationship  between  the  C£’s  and 
A  off  given  by  Eqs.  (20)  and  (21),  dictates  that  the  vectors 
C£ J  obey  the  following  normalization  condition: 

2  CgC{j’“(/iG)-V.  :  (22) 

a  —  I 

(Incidentally,  this  equation  also  shows  that  all  the  eigen¬ 
values  p  are  positive.) 

Formally,  the  elements  of  pa( y)  are  5  functions. 
Hence,  numerically  speaking,  computation  with  them  is 
really  not  possible.  We  choose  to  replace  6d(x)  by 

8d(x)-+ - - - - -1*1*0= /*  (|x|) 

{ir<o)dn  /aUX|;’ 

which,  when  5  is  small,  represents  only  a  small  loss  of 
resolution  in  calculating  pa(  y).  /_  also  has  the  same  in¬ 
tegral  as  the  6  function  it  replaces.  To  this  approxima¬ 
tion 

P„(y)=y  i  /5(|y-y(*,a)|)  ,  (23) 

and  Eq.  (20)  becomes 


AoB~ 


X  2  exp[  — |y(k,a)— y(y,)9)|2/55]  .  (24) 

*.7-1 

We  are  now  in  a  position  to  calculate  our  optima] 
eigenfunctions  ^(y)  from  the  G  data  sets.  Use  Eq.  (24) 
to  numerically  calculate  the  G  X  G  matrix  A  ^ .  Next  cal¬ 
culate  the  eigenvalues  p  and  eigenvectors  C£  of  this  ma¬ 
trix,  being  sure  to  normalize  them  according  to  Eq.  (22). 
We  can  then  form  the  eigenfunctions  tp^y)  by  using  the 
normalized  C£’s  and  the  pa(y)’s  [in  the  form  of  Eq.  (23)] 
in  Eq.  (19). 

In  Fig.  ii  we  show  pj(y)  evaluated  for  the  Henon  at¬ 
tractor  from  a  data  set  £=750  steps  in  length.  These 
data  are  displayed  on  a  grid  of  75  points  in  each  coordi¬ 
nate  direction.  The  other  densities  are  qualitatively  simi¬ 
lar  in  that  they  are  very  spikey.  However,  the  exact  posi¬ 
tion  and  size  of  the  spikes  varies  from  one  sample  to  the 
next.  The  tp^yYs  look  like  the  pa(y)’s  except  that  they 
are  allowed  to  be  negative  is  some  regions.  This  is  not 
surprising  since  they  are  composed  of  the  pa( y)’s  and  the 
weights  (given  by  the  C{j’ s)  are  not  required  to  be  positive 


[cf.  Eq.  (19)].  >'.fl 

We  now  have  a  set  of  G  orthonormal  functions  ^(y)$ 
extracted  from  G  samples  pa( y)  of  the  invariant  distribute 
tion.  We  can  use  the  orthonormality  condition,  Eq.  (18); | 
to  project  a  particular  B^  out  of  Eq.  (13), 

B*= f  dJyp(yWy)  •  (25J  s 

.  . 

Incidentally  this  shows  the  B M  are  invariants  of  the  map¬ 
ping  since  they  are  integrals  of  ip with  the  density  p( y)  • 
(cf.  Sec.  I).  If  we  insert  Eqs.  (12)  and  (19)  into  this  expres¬ 
sion  we  get 

=  2  Cgpa(y(k)) 

a-lk-1 

I 
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=7^2  Ht—  •: 


'  k  -!  la- 1  y- 1 


'i  ( mo)d/ 2 

Xexp[-|y(;,a)-y(k)|2/5] 


where  we  have  used  Eq.  (23).  Equation  (26)  has  been  used 
to  numerically  calculate  the  BJ s  from  the  data. 

This  should  make  it  clear  how  one  evaluates  the  B^'s 
from  the  iV'=G£  data  vectors  y{k,a)  in  Rd.  The  B^’s 
are  the  G  numbers  characterizing  the  invariant  density 
p(y)  by  its  projection  on  the  optimum  basis  vectors 
^(y).  Now  we  wish  to  see  how  to  evaluate  them  from 
our  parametrized  mapping  F(y,a).  The  G  equalities  be¬ 
tween  these  two  evaluations  of  Bfl  form  our  final  con¬ 
straints  on  the  minimization  of  the  cost  function  C(X,a). 

To  determine  B^  from  the  map  F(y) — we  suppress  the 
parameters  a  for  a  moment — we  return  to  the  definition 
of  the  invariant  density  as  expressed  by  Eqs.  (11)  and  (12). 
Call  Ak  the  projection  on  ip^y)  of  each  terra  in  the  sum 
in  this  equation: 

Ak(p)=jddy^(  y)8rf(y-F*(y(l))) 

=^(F*(y(l)» .  (27) 

We  interpret  Ea.  (27)  as  saying  that  Ak{p)  is  the  projec¬ 
tion  of  5rf(y — F  (y{  1 )))  onto  the  orthonormal  eigenfunc¬ 
tions  rp^y).  Using  this  interpretation  we  expand  the  8 
function  in  terms  of  ip^ y)  to  get 

8d(y-F*(y(l)))=2  AkW^ y) . 

p- 1 

For  large  N,  Eq.  (1 1)  can  now  be  written  as 

pi y)=  2  TF  2  AkW  %iy)  • 

ii—  i  JV  *"i 

Comparing  this  equation  to  Eq.  (13)  indicates  that 
1  N 

Bn  =  ~  2  X*(p) 


=  TF  2  ^(F*(y(I))) . 

/v  k  *»1 


FIG.  1 1 .  First  density  for  the  Henon  map  for  L  =  750  and  G  —  5. 


From  Eq.  (19)  and  the  definition  of  pjy)  we  rewrite  this 
equation  as 

=  2  2  Cg6rf(yaa)-F*(y(l)))  . 

y « i t 


This  expression  requires  quite  high  powers  of  F(y,a)  to 
be  evaluated,  and  we  cannot  be  confident  that  such  high 
powers  of  the  parametrized  map  are  numerically  accu¬ 
rate.  When  the  map  is  near  the  correct  or  optimum  map, 
then  we  are  accurately  reproducing  y(k  + 1 )  from  y[k)  by 
a  single  application  of  F(y,a).  We  utilize  this  to  replace 
F*(y(l))  in  the  last  formula  with  F(y(k)).  The  expres¬ 
sion  for  5„’s  then  becomes 


1  N 
*«! 

1  N 
-  2 
*“  i 


NL 


2  2  C^d(yU,a)-F(y{k))) 

a=  1  7“  I 

G  L  C£ 

12” 


dn 


o**iy=i  («■«)' 

Xexp(  —  !y(./,a)-F(y(k))|2/5) 


(28) 


using  our  smoothed  replacement  for  the  6  functions  in 
the  invariant  density. 

This  is  the  equation  we  will  use  to  calculate  the  B^’s 
from  the  map.  To  implement  it  we  take  all  N  points  in 
the  first  portion  of  the  data  set  and  iterate  them  once 
through  the  map  F(y,a).  When  F(y,a)  is  near  the 
correct  map  iterating  the  data  set  once  will  result  in 
points  that  are  still  on  the  attractor.  We  then  evaluate 
the  Gaussian,  and  numerically  sum  all  the  contributions 
between  the  iterates  of  the  data  set  and  the  points  in  the 
G  samples. 

We  close  this  section  with  an  observation  about  the 
B^s.  By  combining  the  definition  Eq.  (27)  with  the  iden¬ 
tity 

5rf(y  — F*  +  '(y(l))) 

=  / ddw  8d(y— F(w))8d(w— F*(y(  1 ))) , 
we  can  derive  the  recursion  relation 

~^k+ \(lt’)=2Twi'  Akty)  > 

n' 

in  which  the  transition  matrix  Tis  given  by 


-  laoo 
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T^fd'yt'S  yty,(F(y)).  (29) 

, ,  jEn  a  matrix  notation  the  recursion  relation  Ak  +  l  =  TAk 
leads  to  an  expression  for 

f  dJypL<y)tyy)=:pL(fi)=:j-  ,2  AkW . 

which  is  * 

(\-T)pL(p)=j-(\-TL)Av. 

Since  limL_^<cpL{p)=  / dJy  p{y)$fl{y)=B(i,  this  shows 
that  the  BM  arc  the  components  of  the  eigenvector  of  T 
with  eigenvalue  unity  and  further  that  all  other  eigenval¬ 
ues  must  lie  within  the  unit  circle,  if  the  expression,  Eq. 
(11),  forp(y)  converges.  By  our  assumption  that  thep(y) 
we  observe  is  unique,  we  infer  that  the  eigenvalue  unity  of 
T  is  nondegenerate.  We  tried  to  implement  this  observa¬ 
tion  about  the  B ^  to  yield  a  method  for  numerically 
determining  them  from  Eq.  (29)  (Refs.  18  and  19),  but 
found  roundoff  error  undermined  our  efforts. 


V.  OPTIMIZATION  OF  THE  CONSTRAINED 

COST  FUNCTION:  PARAMETER  DETERMINATION 

A.  Analysis  for  the  Hinon  map 

Our  first  application  of  the  methods  described  above  is 
to  data  generated  by  the  Henon  map  of  the  plane  to  itself. 
Data  were  created  by  iterating  the  map  from  some  initial 
condition  and  discarding  the  first  50  points  of  that  data 
set.  Two  data  sets  of  X|(n)  were  created  this  way.  The 
first  had  #'=3750  points  which  we  divided  into  five 
groups  of  750  points  each.  These  groups  were  used  to 
create  the  densities  pa( y),  and  the  phase-space  correlation 
function  among  groups  was  used  to  generate  the  eigen¬ 
functions.  The  second  data  set  was  then  used  to  select 
samples  of  length  N— 750,  1200,  and  1752  for  our 
analysis. 

We  first  studied  the  distribution  of  Euclidian  dis¬ 
tances  among  the  two  vectors  y(n)—(xi(n),xl(n  +1)), 
n  =  1,2, . . . ,  N  —  1  formed  from  the  data  set.  On  the  nat¬ 
ural  scale  of  the  attractor,  which  is  order  unity,  the 
minimum  distance  was  always  order  10“5-10~4.  This 
led  us  to  choose  the  parameter  a  in  our  maps  to  be 
a = 5  X 10-6  so  that  each  data  point,  at  least  for  N  > 500, 
would  have  neighbors.  We  varied  a  by  a  factor  of  10  or 
so  with  no  qualitative  differences  in  our  results.  A 
thorough  parameter  search  would  vary  a  in  the  con¬ 
strained  minimization  of  the  cost  function. 

Next  we  chose  to  use  four  parameters  a  in  our  set  and 
took  the  powers  m 3  and  m4  in  F(y,a)  to  be  m3= 4, 
m4  =  5.  We  did  not  further  vary  these  parameters.  Our 
choice  of  four  a’s  rested  on  our  knowledge  that  we  would 
be  constraining  our  cost  functions  by  only  the  largest 
Lyapunov  exponent  A.,  and  the  projection  Bx  of  ply)  on 
the  eigenfunction  ^,(y)  with  the  largest  eigenvalue.  Four 
seem  d  a  minimum  reasonable  number  of  parameters, 
and  s.nce  the  work  required  to  search  large  parameter 
sets  can  become  significant,  we  were  content  with  four. 


In  effect,  we  had  two  free  parameters  in  F(y,a)  when  th*j 
values  of  A.,  and  B  t  were  specified. 

Our  final  a  priori  choice  was  on  the  values  of  X-  in  the  cj 
predictor  Eq.  (3).  We  took  three  terms  here  since  we:^ 
were  being  quite  conservative  in  how  many  iterations  of  1 
the  map  F(y,a)  we  felt  we  could  trust.  Then,  further, 
reflecting  our  sense  that  iterations  of  F(y,a)  could  be¬ 
come  unreliable,  we  chose  X|=0.8,  X^—0. 1,  and 
X3  =0. 1.  Once  again  the  X’s  could  be  parameters  which  ■ 
vary  in  our  constrained  minimization.  We  found  that.^ 
varying  the  X’s  by  20%  or  so  did  not  qualitatively 
change  our  results.  In  the  case  of  the  Lorenz  attractor 
study  discussed  in  Sec.  V  B  we  report  results  for  Xx  =0.5, 
-Y2— 0. 3,  and  X3  =0.2,  and  note  that  the  cost  function 
changes  by  « 20%. 

We  chose  to  simply  fix  the  X’s  for  purposes  of  this  pa- 
per.  Clearly,  the  X’s  can  be  varied  along  with  the  a’s,  ay  . 
and  m3, . . .  ,mf,  if  one  wishes.  Ours  is  a  first  try  with 
the  F(y,a)  we  have  chosen  in  fitting  the  data  and  meeting 
the  invariant  constraints.  The  feasibility  of  accomplish¬ 
ing  this  seemed  daunting  enough  when  we  set  out.  We 
expect  to  include  many  more  parameters  in  future  work 
in  this  area. 

One  additional  important  matter  deserves  note  before 
we  proceed  to  the  discussion  of  our  numerical  results. 
The  maps  F(y,a)  as  we  carry  out  our  search  over  the  pa¬ 
rameters  a  have  very  little  ability  to  reliably  fit  the  given 
data  for  most  a.  Only  when  we  arrive  near  a  (con¬ 
strained)  minimum  of  the  cost  function  can  we  be  very 
confident  that  our  map  is  reasonable.  Until  the  map  is 
near  the  optimum  map  points  in  the  data  set  are  quite 
often  mapped  far  off  the  attractor.  For  numerical  stabili¬ 
ty  in  our  search  algorithms  we  need  a  method  to  identify 
orbits  which  are  leaving  the  attractor  for  nonoptimal 
values  of  a  and  return  them  to  the  neighborhood  of  the 
attractor. 

Maps  of  the  form  we  have  chosen  have  the  feature  that 
points  far  off  the  attractor,  as  defined  by  the  data  set  it¬ 
self,  are  mapped  to  y=0.  There  is  no  reason  to  expect 
the  origin  of  coordinates  to  lie  on  an  attractor  which  has 
dA<d  and  is  quite  sparse  in  Rd,  but  we  choose  to  always 
translate  our  data  set  so  one  of  its  points  is  the  origin. 
This  changes  nothing  about  the  signal  processing  issues 
we  address  in  •  this  paper  and  makes  our  parameter 
searches  numerically  sensible. 

With  this  translation  of  the  origin,  an  orbit  being  gen¬ 
erated  by  F(y,a),  when  a  is  not  optimal,  which  tries  to 
depart  significantly  from  the  attractor  is  sent  back  to 
y=0,  which  is  now  on  the  attractor.  When  a  is  near  its 
optimal  values,  this  feature  is  operationally  unimportant 
because  the  map  is  tracking  the  data  very  accurately. 

Our  experience  indicates  that  if  one  is  trying  to  create 
global  maps  F(y,a),  as  we  are  here,  some  form  of  “orbit 
reinjection”  will  be  required  to  give  numerical  sense  to 
the  whole  process  of  searching  parameter  space  to  mini¬ 
mize  the  cost  function.  The  problem  becomes  more  im¬ 
portant  as  d  grows,  since  the  attractor  of  dimension 
da  <  d  occupies  “less  and  less”  of  the  full  volume  of  the 
phase  space.  If  one  is  making  “fits”  to  the  data  by 
numerous  local  or  nearly  local  polynomial  maps  as  in  the 
work  of  FS,  the  issue  raised  here  is  absent.  Global  maps 
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Juve  an  economy  of  parameters  and  a  potential  ease  of 
interpretation;  local  maps  appear  to  have  an  advantage  of 
calculational  speed.  We  have  no  overall  judgment  of  a 
way  to  choose  between  these  alternatives. 

Our  results  for  data  from  the  Henon  map  are  shown  in 
Table  III.  The  parameter  searches  were  carried  out  using 
the  FORTRAN  package  NPSOL.41'  One  of  its  authors,  Gill, 
was  kind  enough  to  consult  with  us  extensively  on  its  use 
and  on  the  interpretation  of  its  output.  For  each  value  of 
the  number  of  points  in  the  data  set,  namely,  N= 750, 
1200,  and  1752,  we  report  seven  quantities  for  each  of 
three  cases:  (1)  unconstrained  minimization  of  the  cost 
function;  (2)  minimization  constrained  by  Aj 
(X““P= 0.408);  and  (3)  minimization  constrained  by  both 
A(  and  Bt.  In  each  case  we  report  the  value  of  the  cost 
function  normalized  by  the  sum  of  the  squares  of  the  Eu¬ 
clidian  lengths  over  all  data  vectors,  the  values  of  the  a’s 
at  the  minimum  cost  function,  and  the  deviations  AA| 
and  A B,  from  the  values  of  Xj*u  and  2?f*u  determined  by 
the  data.  The  allowed  tolerances  on  these  deviations  are 
set  in  NPSOL  by  the  user.  We  typically  required  the  rela¬ 
tive  magnitudes  of  AA,  to  A,  and  the  same  for  5,  to  be  in 
the  range  0.5-5  %.  This  is  not  a  limitation  of  NPSOL,  but 
it  seemed  quite  accurate  enough  for  our  purposes. 

A  look  at  Table  III  reveals  a  consistent  pattern.  Un¬ 
constrained  optimization  resulted  in  a  cost  function  with 
a  rms  deviation  of  our  predictor  from  the  data  of  0.1%  or 
smaller.  Not  surprisingly  when  we  track  the  data  so  ac¬ 
curately,  the  value  of  2?,  comes  out  quite  precise.  The 
"value  of  A™,p  for  this  best  least-squares  fit  is  remarkably 
bad.  Indeed,  in  our  examples  this  quantity  was  actually 


negative,  which  indicates  the  absence  of  chaos  for  the 
parametric  map. 

When  the  A,  constraint  is  imposed,  the  parameters  a 
change,  but  we  regard  their  specific  values  as  of  inciden¬ 
tal  interest  here.  More  important  is  the  observation  that 
the  rms  value  of  the  cost  function — the  bare  measure  of 
the  quality  of  the  fit — remains  about  0.5%  while  the 
Lyapunov  exponent  is  now  accurate  to  about  1%  or 
better.  Of  course,  having  moved  away  from  the  very  best 
point-to-point  least-squares  tracking  of  the  data,  the.ac- 
curacy  of  B,  degrades  to  «10%.  Finally,  imposing  both 
constraints  we  achieve  0.5%  or  so  in  the  rms  error  for  the 
cost  function,  highly  accurate  At,  and  somewhat  better 
Bt  values. 

The  message  of  these  calculations  is  that  the  procedure 
we  outlined  in  this  paper  is  both  feasible  and  highly  accu¬ 
rate.  The  few  scalar  numbers,  ‘he  cost  function,  Aj,  and 
Bt  do  not  tell  the  whole  story.  One  can  take  the  map 
with  the  optimum  a’s  and  calculate  a  new  orbit  starting 
from  some  new  phase  space  point  ynew(  1 ):  yn*w(  1), 
)|K*(  2), . . . ,  and  compare  the  new  orbit  to  that  generat¬ 
ed  by  the  Henon  map  starting  with  the  same  initial  point. 
The  data  so  generated  look  the  same  when  plotted  as  a 
sequence  of  two  vectors,  but  this  temporal  representation 
contains  very  little  useful  information,  so  we  do  not  show 
it. 

What  is  more  important  is  the  fact  that  our  predictor 
y(m  +1)=  2  **F*(y(m-iH-l),a)  (30) 

i 

accurately  predicts.  We  have  taken  numerous  points 


TABLE  III.  Optimization  results  for  Henon  map  data.  C(X,a)  is  shown  with  and  without  invariant  constraints. 
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from  our  data  set  and  evolved  them  forward  by  use  of  the 
predictor.  We  find  we  are  able  to  track  the  actual  data  to 
the  1%  level,  seven  to  ten  steps  along  the  orbit  all  around 
the  attractor.  This  means  that  iterates  of  our  optimum 
map  F*(y,a)  are  accurate  to  A»7-10,  far  beyond  our 
original  safe  choice  of  k  =3.  The  implications  of  this  re¬ 
markable  accuracy  for  prediction  and  control  of  non¬ 
linear  chaotic  systems  are  transparent. 

f 

B.  Prediction  for  the  Lorenz  systeai 

We  now  turn  to  the  application  of  our  methods  to  the 
Lorenz  system,  defined  by  Eq.  (8).  These  equations  were 
originally  motivated  by  an  attempt  to  model  atmospheric 
phenomenon  using  only  a  few  degrees-of-freedom  dynam¬ 
ical  system.  It  was  one  of  the  first  systems  known  to  ex¬ 
hibit  an  attractor  of  fractal  dimension,  or  a  strange  at¬ 
tractor,  and  consequently  to  connect  this  with  the  ap¬ 
parent  chaotic  motion  of  the  resultant  dynamics.  The 
primary  concern  in  modifying  our  previous  techniques 
for  use  on  the  Lorenz  system  will  be  (i)  the  jump  to  a 
three-dimensional  embedding  space,  which  will  require 
much  longer  time  scries  to  properly  fill  out  the  attractor, 
and  (ii)  the  large  difference  in  the  macroscale  of  the  two 
attractors,  which  will  require  the  rescaling  of  some  of  the 
variables  we  have  previously  defined.  We  will  first,  how¬ 
ever,  give  a  short  review  of  some  of  the  characteristics  of 
the  Lorenz  system. 

For  the  parameter  values  <7  =  16.0,  r=45.92,  and 
b— 4.0,  the  Lorenz  system  possesses  a  strange  attractor 
wnicl*  has  become  one  of  the  classic  examples  of  non¬ 
linear  science.  The  structure  consists  of  two  nearly  fiat 
lobes  connected,  roughly  at  a  point  and  angled  somewhat 
with  respect  to  one  another.  Hence  the  local  dimension 
of  the  attractor  is  essentially  two,  however,  the  minimum 
embedding  space  required  is  three.  Note  that  the  motion 
of  the  phase-space  orbits  for  the  Lorenz  systems  is  con¬ 
tinuous,  i.e.,  a  flow,  as  opposed  to  that  of  the  Henon  sys¬ 
tem  which  is  a  mapping.  The  discretization  of  the 
Lorenz  orbits  after  phase-space  reconstruction,  and  the 
density  of  points  along  an  orbit,  is  therefore  due  to  the 
choice  of  a  sampling  rate  in  the  measurement  of  the  time 
series  of  data.  This  sampling  rate  therefore  can  be 
thought  of  as  setting  a  time  scale  in  the  reconstructed 
picture  of  the  attractor.  In  turn,  this  time  scale  deter¬ 
mines  the  time-delay  values  for  the  method  of  phase- 
space  reconstruction  used  in  Sec.  II,  the  evolution  times 
for  Lyapunov  exponent  calculations,  etc.  A  discussion  of 
optimal  ranges  of  sampling  rates,  and  the  problems  which 
occur  when  sampling  rates  are  too  large  or  small,  is  given 
at  some  length  by  Mayer-Kress/0  In  practical  applica¬ 
tions,  of  course,  one  often  has  no  control  over  the  data  set 
one  is  presented  with,  although  too  frequent  sampling 
can  often  be  remedied  by  simply  throwing  away  data. 

To  investigate  the  behavior  of  our  prediction  technique 
on  a  system  with  a  somewhat  larger  embedding  space,  we 
chose  the  Lorenz  system  as  a  test  case  with  known  pa¬ 
rameters,  as  was  done  with  the  Henon  system.  An  “ex¬ 
perimental”  time  series  was  generated  for  the  Lorenz 
equations,  Eq.  (8),  using  the  parameter  values  listed 
above,  by  a  Runge-Kutta  numerical  integration  scheme 


with  a  fixed  time  step  of  0.03.  A  data  set  of  the  x,  vari¬ 
able  consisting  of  approximately  20000  points  (after  tran¬ 
sients)  was  generated.  We  used  the  same  time  series  for 
all  of  our  numerical  runs. 

As  we  have  stated  above  (cf.  Sec.  II)  we  chose  an 
embedding  dimension  of  three.  Note  that  in  an  actual  ex¬ 
perimental  situation,  a  more  cautious  choice  of  four 
would  also  be  reasonable,  although  this  would  have  in¬ 
creased  our  computational  requirements  by  a  significant 
amount.  For  the  choice  of  the  delay  time  constant  r  a 
number  of  different  choices  could  be  made.  Since  the  de¬ 
lay  time  reconstruction  is  rather  weakly  dependent  on 
this  constant,  provided  one  is  within  certain  limits,  there 
is  no  unique  choice  for  this  variable.  Our  final  choice  was 
motivated  by  the  desire  to  have  the  reconstructed  attrac¬ 
tor  look  most  like  the  original  Lorenz  attractor.  This  re¬ 
sults  in  a  time  delay  of  two  time  steps.  For  an  actual  case 
where  one  would  have  no  a  priori  sense  of  what  the  at¬ 
tractor  looks  like,  the  methods  of  Sec.  II  are,  of  course, 
recommended.  A  feel  for  the  required  density  of  points 
can  also  be  obtained  by  calculating  the  minimum 
nearest-neighbor  distance,  and  perhaps  the  frequencies 
that  a  range  of  somewhat  larger  neighbor  distances 
occur,  and  comparing  this  with  the  “macroscale”  of  the 
attractor  (i.e.,  the  maximum  ranges  of  the  coordinates  of 
an  attractor). 

The  embedding  dimension  and  delay  time  comprise  the 
two  parameters  necessary  to  correctly  reconstruct  the  dy¬ 
namics  of  the  systems  attractor,  and  hence  is  the  first  step 
in  setting  up  the  prediction  method.  We  now  turn  to  the 
changes  necessary  in  the  numerical  algorithm  when  we 
consider  the  Lorenz  system. 

The  most  significant  difference  between  the  prediction 
models  for  the  Henon  system  and  for  the  Lorenz  system 
is  that  of  the  size  of  the  time  series  required  for  the 
phase-space  reconstruction.  Because  of  the  increase  in 
the  dimensionality  cf  the  embedding  space  from  two  to 
three,  the  number  of  phase-space  points  required  to  per¬ 
form  our  procedure  increases  dramatically.  The  reasons 
for  this  is  clear.  Our  prediction  function  F(y,a)  requires 
that  most  points  have  a  significant  number  of  nearby 
neighbors,  i.e.,  points  within  distances  of  a  few  Va  values 
so  that  a  good  “mapping”  of  the  local  phase  space 
around  a  particular  region  is  obtained.  Additionally, 
nearby  neighbors  are  important  to  obtain  good  numerical 
approximations  to  the  gradients  of  the  objective  and  con¬ 
straint  functions.  Since  the  number  of  points  required  to 
yield  a  given  mean  nearest-neighbor  distance  is  consider¬ 
ably  larger  for  a  volume  than  for  an  area,  the  number  of 
points  required  to  properly  fill  out  the  attractor  is  much 
greater  for  a  three-dimensional  embedding  space.  In  Sec. 
II  we  presented  general  methods  for  determining  the 
number  of  data  vectors  needed  for  a  given  embedding  di¬ 
mension  d.  For  our  particular  analysis  of  the  Lorenz  at¬ 
tractor  reported  here,  we  found  that  the  minimum  num¬ 
ber  of  points  that  gave  reasonable  results  to  be  about 
6000.  For  the  numerical  experiments  reported  in  Table 
IV  we  used  data  sets  with  6000  and  with  8000  points. 

One  final  change  in  the  numerical  parameters  for  the 
prediction  code  is  in  the  number  of  matrices  that  are  to 
be  multiplied  together  to  obtain  the  Lyapunov  exponent 
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TABLE  IV.  Optimization  result*  for  Lorenz  attractor  data.  CfX.a)  is  shown  with  and  without  invariant  constraints. 
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from  the  mapping  function.  Since  each  iteration  of  the 
Henon  map  represents  a  significant  evolution  of  the  sys¬ 
tem,  the  multiplication  of  500  Jacobian  matrices  for  the 
Lyapunov  calculation  represents  a  good  average  over  the 
phase  space,  and  results  in  fairly  good  accuracy  of  the 
final  value.  However,  each  step  of  the  time  series  for  the 
Lorenz  system  represents  much  less  evolution  time  for 
the  dynamics.  It  was  necessary  to  experiment  with  the 
number  of  matrices  required  to  give  good  convergence. 
It  was  found  that  about  1000  matrix  products  gave  a 
reasonably  good  convergence  to  the  final  value,  but  was 
still  not  excessively  computationally  intensive. 

To  complete  the  formulation  of  the  prediction  model 
for  the  Lorenz  data,  it  is  necessary  to  pick  the  exact  form 
of  the  mapping  and  cost  functions  that  arc  to  be  mini¬ 
mized.  We  first  discuss  the  choice  of  the  polynomial 
terms  which  multiply  the  exponential  in  the  mapping 
function.  These  terms  are  defined,  as  for  the  Henon 
analysis,  with  the  intention  of  giving  the  exponential 
form  in  the  mapping  fur  :tion  a  longer  “tail”  by  adding 
multiplicative  polynomial  terms  to  it.  As  for  the  Henon 
analysis,  we  chose  to  use  four  polynomial  terms  in  the 
mapping  function,  and  hence  have  four  variables  in  the 
minimization  fit.  The  first  coefficient  is,  of  course,  the 
constant  term,  and  the  second  again  multiplies  the  linear 
term  that  expresses  some  dependence  of  the  mapping 
function  on  the  Lyapunov  exponent.  Therefore  there 
remains  to  be  determined  the  powers  of  the  last  two  poly¬ 
nomial  terms. 


In  choosing  the  values  of  the  exponents  of  the  remain¬ 
ing  two  polynomial  terms,  we  recall  that  we  wish  to 
elongate  the  tail  of  the  exponential  term  in  the  mapping 
function  to  make  it  feel  more  of  the  surrounding  neigh¬ 
bors.  However,  we  do  not  wish  to  make  these  exponents 
so  large  that  we  increase  the  scale  well  beyond  that  which 
we  set  by  cr.  After  some  experimentation,  we  chose 
m  =3  and  6  as  the  two  powers  for  the  polynomial  terms, 
although  this  is  by  no  means  the  only  possible  choice. 

The  second  set  of  parameters  of  the  minimization  pro¬ 
cedure  which  need  to  be  chosen  are  the  X’s  which  appear 
in  the  definition  of  the  cost  function  Eq.  (4).  These 
coefficients  weight  the  different  iterates  of  the  map  F(y,a) 
and  essentially  determine  how  many  iterates  forward  we 
wish  the  map  to  accurately  reproduce  the  data.  For  the 
Henon  analysis,  we  chose  three  X’s  with  values 
(0.8, 0.1, 0.1).  Our  choice  indicates  a  desire  to  weight  the 
firs!  forward  iterate  very  heavily,  while  giving  the  second 
and  third  iterates  only  minimal  importance.  This  set  of 
values  was  chosen  primarily  becau  c  the  Henon  system  is 
a  mapping,  and  each  iterate  represents  a  large  step  in 
evolution  of  the  original  system.  On  the  other  hand,  the 
Lorenz  system  produces  a  flow  in  phase  space,  and  the 
time  step  ve  chose  for  each  iterate  of  the  time  scries 
represents  a  rather  small  amount  of  forward  evolution  of 
the  system.  Thus  we  choose  to  weight  some  of  the  multi¬ 
ple  iterates  of  the  map  more  heavily  than  we  did  for  the 
Henon  system.  We  have  therefore  presented  data  for  the 
Lorenz  system  with  two  different  sets  of  values  for  these 
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parameters.  In  one  case  we  used  the  original  weights  of 
The  Henon  system  (0.8, 0.1, 0.1).  For  the  other  case,  we 
^weighted  the  multiple  iterates  more  heavily,  namely, 
(0.5, 0.3, 0.2).  Note  that  we  could  have  easily  chosen  to 
take  more  than  two  multiple  iterates  of  the  system.  How¬ 
ever,  for  the  sake  of  simplicity  and  comparison  we  chose 
to  use  two  as  for  the  Henon  system.  We  also  point  note 
that  the  X’s  like  the  a’s  could  be  made  variables  in  the 
minimization  search;  we  will  do  that  in  our'further  work 
in  this  matter. 

Finally,  to  determine  a  value  of  the  parameter  a 
(which  sets  a  characteristic  scale  of  distance  over  which 
the  mapping  function  is  influenced  by  neighbors),  it  is 
necessary  to  experiment  with  different  values  by  actually 
doing  a  number  of  minimization  runs.  One  can,  however, 
make  an  a  priori  guess  by  considering  two  factors.  The 
largest  value  that  a  can  possibly  have  will  certainly  be 
the  scale  of  the  linear  regime  for  the  system.  This  is  very 
roughly  about  1%  of  the  attractors  macroscale,  as  men¬ 
tioned  previously.  Hence  cr  should  be  considerably 
smaller  than  this  value.  Additionally,  the  smallest  value 
that  a  can  possibly  attain  is  given  by  the  smallest  neigh¬ 
bor  distance  of  the  data  set,  and  should  be  at  least  one  to 
two  orders  of  magnitude  larger  than  this  value.  Within 
this  range,  a  must  be  chosen  with  some  experimentation. 
We  have  found  that  typically,  the  value  of  the  C(X,a)  at 
its  minima  will  be  relatively  large  for  larger  values  of  a, 
and  decreases  until  a  threshold  in  a  is  crossed.  For 
values  of  cr  smaller  than  the  threshold  value,  the  minima 
of  C(X, a)  becomes  a  great  deal  less,  sometimes  by  an  or¬ 
der  of  magnitude  or  more.  We  recommend  that  a  be 
chosen  somewhat  smaller  than  this  threshold  value,  how¬ 
ever,  not  too  much  smaller  as  it  is  still  desirable  to  have 
as  much  of  the  surrounding  phase  space  ns  possible  con¬ 
tribute  to  the  mapping  of  each  orbital  point.  For  our  ex¬ 
periments  on  the  Lorenz  system  we  used  a  — 1.0  X 10-4. 

Using  the  parameter  values  stated  above,  a  search  for 
the  minima  of  Eq.  (4)  in  the  parameter  space  a  was  con¬ 
ducted  using  the  npsol  (Ref.  41)  package.  Since  there  is 
no  general  method  known  for  determining  the  absolute 
minimum  of  a  function  using  numerical  methods,  one 
generally  proceeds  by  finding  the  minima  after  iteration 
for  each  of  a  large  number  of  initial  conditions,  while  at¬ 
tempting  to  cover  a  large  representation  of  the  phase 
space.  In  practice,  one  will  usually  find  a  number  of  local 
minima,  all  of  which  have  “basins  of  attraction”  of  vary¬ 
ing  sizes.  After  a  number  of  runs,  one  usually  will  gain 
some  intuition  as  to  which  regions  of  the  parameter  space 
evolve  to  which  local  minima.  When  some  confidence  is 
gained  that  a  large  region  of  the  parameter  space  has 
beer,  investigated,  we  label  the  minimum  with  the  lowest 
cost  function  value  the  “absolute"  minimum.  Of  course, 
generally  speaking,  one  can  never  be  sure  that  one  has 
bound  the  actual  global  minimum. 

Using  the  time  series  for  the  Lorenz  data  and  the  pa¬ 
rameter  values  we  have  just  described,  the  NPSOL  routine 
was  able  to  find  a  number  of  minima  of  the  cost  function 
C(X,a).  There  values  ranged  over  as  much  as  two  orders 
of  magnitude.  The  lowest  value  of  the  cost  function 
found  was  in  the  neighborhood  of  1.87X  10-5,  as  indicat¬ 
ed  in  Table  IV.  In  the  preliminary  analysis  there  were 


three  minima  which  had  almost  this  same  value.  A  more 
detailed  analysis,  however,  found  that  after  many  itera¬ 
tions  of  the  search  routine  two  of  these  minima  actually 
evolved  into  the  third.  Using  better  error  tolerances  in 
NPSOL,  it  was  found  that  this  point  actually  did  have  a 
slightly  lower  minima.  It  should  be  noted  that  even 
though  the  three  minima  had  cost  functions  which  agreed 
very  closely,  their  resulting  values  for  the  a’s  were  much 
different.  This  is  in  keeping  with  our  observation  that, 
for  a  large  range  of  parameter  values  around  these  mini¬ 
ma,  the  cost  function  was  very  “flat”  with  respect  to  the 
parameters,  i.e.,  C(X,a)  varied  very  little  over  a  large 
range  of  a’s.  This  has  the  unfortunate  effect  of  causing 
the  iteration  procedure  to  proceed  very  slowly,  since  the 
minima  were  very  shallow,  and  a  large  number  of  itera¬ 
tions  were  required  to  achieve  the  optimal  solution.  One 
possible  conclusion  from  this  is  that,  if  one  were  interest¬ 
ed  in  a  purely  least-squares  fit  of  the  map  to  the  data,  any 
of  the  parameter  sets  in  this  range  were  nearly  as  good  as 
the  optimal  solution. 

After  the  analysis  just  described,  we  performed  another 
changing  ‘.he  X’s  changed  to  (0.5,0.3,0.2).  These  parame¬ 
ter  values  weight  the  later  iterates  of  the  map  more  heavi¬ 
ly,  and  correspond  to  trying  to  make  the  map  predict  far¬ 
ther  into  the  future.  We  did  not  impose  the  B p  con¬ 
straints  on  the  Lorenz  system,  but  used  this  system  to  ex¬ 
plore  the  variations  on  the  cost  fu:  tion  and  the  quality 
of  our  ability  to  reproduce  the  largest  Lyapunov  ex¬ 
ponent  as  we  changed  the  weights  Xj  in  the  predictor. 

The  results  of  these  minimization  searches  are  also 
presented  in  Table  IV;  both  6000  and  8000  points  on  the 
attractor  are  used  in  our  example.  As  can  be  seen,  the 
cost  function  for  these  minima  are  about  y  higher  than 
for  the  previous  system,  and  this  is  to  be  expected  since 
the  later  iterates,  which  must  be  inherently  less  accurate, 
now  give  a  much  larger  contribution  to  the  cost  function. 

In  terms  of  relative  fitting  error,  however,  these  minima 
are  still  surprisingly  low.  The  final  parameter  values,  al-  | 
though  significantly  different  from  the  previous  system, 
are  still  similar  enough  to  give  the  same  general  character 
to  the  fitting  function. 

One  noticeable  difference  between  the  two  different 
values  of  X’s  was  in  the  fitting  of  the  map  using  the 
Lyapunov  constraint.  The  iteration  procedure  for  the 
(0.5,0.3,0.21  system  went  far  more  quickly  than  for  the  i 
(0.8,0.1,0.11  system.  This  can  probably  be  interpreted  in 
terms  of  the  fact  that  if  later  iterates  of  the  map  are 
weighted  more  heavily,  then  the  parameters  result  in 
more  sensitivity  of  the  map  to  the  Lyapunov  constraint, 
which  usually  requires  longer  evolution  times  to  manifest  f 
i.’self  for  flows.  f 1 

.i 

VL  SUMMARY  AND  FUTURE  TASKS  : 

In  this  paper  we  have  given  a  set  of  procedures  which 
one  may  use  to  process  signals  x(n),n  =  1,2, ,  having 
a  broadband  power  spectrum.  Using  numerically  gen¬ 
erated  data  from  the  Henon  map  and  from  the  Lorenz 
equations  we  have  also  demonstrated  explicitly  the  feasi¬ 
bility  of  our  procedures.  Processing  a  signal  means  that 
from  the  time  series  x(n)  we  do  the  following. 
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Find  an  integer-dimensional  embedding  space  of  time 
lagged  d  vectors 

y(n  )=(xln),x[n  +tj),  . . .  ,x(n  +rrf_t)) , 

which  fully  expose  the  geometric  structure  of  the  attrac¬ 
tor  on  which  the  data  evolves.  The  attractor  has  dimen¬ 
sion  d A  which  may  be  fractional.  Choosing  the  integer 
d  >  2 dA  + 1  is  guaranteed  to  be  sufficient  for  this  purpose, 
but  smaller  d  may  often  work. 

Find  invariants  of  the  evolution  y(  1  ),y{2), . . . ,  j(N)  in 
R4 — specifically,  the  Lyapunov  exponent  spectrum 
Aj.A^  ...,Arf  and  selected  optimum  moments 
. . . , Bg,  of  the  invariant  density  p( y),  on  the  at¬ 
tractor. 

Use  these  vectors  y(n)  and  invariants  to  construct  a 
parametrized  map  of  Rrf  to  itself  y-+F(y,a),  which  mini¬ 
mizes  a  certain  constrained  least-squares  cost  function 
based  on  the  residual  errors  of  a  nonlinear  predictor 

y(m+l)=  2  XkFk(y(m  —k  + 1  ),a)  , 

*-i 

involving  iterates  F*  of  the  map. 

The  output  of  the  signal  processing  is  the  map 
F(y,a) — both  its  form  and  the  parameters  a — and  the 
coefficients  Xj  in  the  predictor. 

A  map  F(y,a)  and  a  predictor  which  give  very  small 
least-squares  residuals  when  evaluated  on  the  data  we  call 
reliable.  We  have  explicitly  demonstrated  in  this  paper 
that  even  a  reliable  F(y,a)  does  not  necessarily  reproduce 
invariants  such  as  the  Aa  and  the  B ^  discussed  by  us. 
The  reason  is  that  a  least-squares  tracking  of  a  data  set 
y(n)  by  a  map  y in  +1  )~F(y(«),a)  does  not  necessarily 
provide  a  good  evaluation  of  the  local  tangent  space  map¬ 
ping  M,j =dF,(y)/dyj.  A  map  which  is  reliable  and  also 
gives  the  correct  invariants  we  call  representational.  Our 
maps  are  representational  because  we  constrain  the 
least-squares  minimization  by  the  invariants.  A  map 
which  closely  tracks  data  but  does  not  yield  the  dynami¬ 
cal  invariants  misses  the  essential  ingredients  which  clas¬ 
sify  or  identify  the  dynamical  system  underlying  the  data. 

Another  way  to  state  our  constrained  optimization 
procedure  is  that  the  cost  function  to  use  in  determining 
the  map  should  not  be  composed  only  of  the  square  of  the 
residuals  in  the  predictor.  It  should  also  contain  terms 
which  measure  the  residuals  in  matching  the  invariants 
determined  by  the  data  and  the  same  quantity  determined 
by  the  maps.  NPSOL  and  other  contemporary  optimiza¬ 
tion  routines  do  essentially  this  by  a  combination  of 
Lagrange  multiplier  and  quadratic  penalty  terms  added 
to  the  least-squares  cost  function.  This  point  of  view  sug¬ 
gests  that  we  should  not  focus  on  the  size  of  C(X,a) 
as  our  goodness  of  fit  criterion  but  on  C(X,a) 
+ our  Tables  III  and  IV  we 
have  reported  the  values  of  each  of  these  quantities  sepa¬ 
rately,  but  the  sum  as  noted  should  measure  the  merit  of 
our  mnps. 

In  practice,  carrying  out  our  signal  processing  program 
raises  a  number  of  issues  of  importance  in  dynamical  sys¬ 
tems  as  well  as  in  the  present  context.  The  first  of  these  is 
the  determination  of  the  dimension  d  of  the  embedding 
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space  in  which  the  phase-space  reconstruction 
jc  <« ) — *-y(rt)  takes  place.  We  have  used  the  correlation 
function  Eq.  (6),  but  the  choice  of  a  dimension  at  which 
this  stops  changing  is  quite  subjective.  Establishing  an 
objective  criterion  would  be  most  useful.  Perhaps  one  of 
the  information  theoretic  criteria  developed  in  statistics 
for  identifying  the  number  of  degrees  of  freedom  in  a  data 
set  would  provide  a  tool  here/2  An  objective  criterion 
for  establishing  the  time  delays  ra  would  also  be  desir¬ 
able. 

Methods  for  determining  the  Lyapunov  spectrum 
A,, . . . ,  Arf  from  the  data  are  also  quite  important.  These 
are  classifiers  of  the  dynamical  system  and  a  representa¬ 
tional  map  must  reproduce  them.  This  is  not  at  all  a  new 
issue  as  should  be  clear  from  the  discussions  in  Sec.  III. 
Our  own  work  in  this  area,  which  will  be  reported  in  de¬ 
tail  in  a  subsequent  paper,  uses  local  maps  of  the  form  of 
our  F(y,a)  and  fits  the  parameters  a  and  a  to  the  tangent 
map  at  every  time  step.  The  local  tangent  map 
'M(a(n)),y  takes  groups  of  phase-space  points  in  the 
neighborhood  of  the  orbit  point  y(n)  into  groups  around 
y(n  + 1).  The  dependence  of  M  on  y  is  sensitive  to  the 
variation  of  M  over  the  neighborhood  of  phase-space 
points.  When  one  has  short  data  sets  and  thus  sparse 
neighborhoods,  this  dependence  on  y  gives  a  better  ap¬ 
proximation  to  M(y)  than  a  local  constant  matrix.43  The 
eigenvalues  of  the  product  of  the  local  M’s  along  the  or¬ 
bit  yield  the  Aa. 

As  should  be  clear  from  our  discussion  of  the  structure 
of  the  parametrized  map  F(y,a),  if  we  remain  with  our 
generafform  (which  we  do  not  insist  on),  then  properties 
ofg(y,y(n);a)  are  what  we  must  address.  Our  choice  in 
this  paper  has  been  to  use  scalar  products  of  y  and  y(  n ) 
in  forming  g.  These  are  insensitive  to  directional  infor¬ 
mation  on  the  attractor.  The  structure  of  neighborhoods 
of  phase-space  points  near  the  orbit  y(n)  is  not  isotropic, 
so  much  of  the  information  in  our  data  may  be  used  in 
our  present  choice  of  g.  Since  we  want  g  to  provide 
direction  sensitive  weights,  we  might  wish  to  build  in 
some  of  the  local  phase-space  structure  on  the  attractor. 
Some  of  this  information  is  contained  in  the  correlation 
function  among  points  in  the  neighborhood  of  the  orbit. 
If  an  orbit  po;nt  yin)  has  NB  neighbors  ye{n)  within  v^a, 
the  correlation  function  is 

1  Ng 

Ry(n)=— -  2  [>^(n)-.>’(n)]/[.v^(n)-y(n)]f  . 

" B  0~\ 

Following  a  suggestion  of  Fukunaga22  we  would  use  the 
local  correlation  matrix  in  ourg(y,y(n);a)  by  making  the 
replacements 

|y-y(n)|2-*  2  Lv-3’(»)],TF,7l(n)[y-y(«)]/ 

i 

and 

y(n)-(y— y(n))-*  2  y(n)iW,Jl{n)[y  — y(«)]y  . 
i 

This  now  emphasizes  directions  in  phase  space  along  the 
attractor  where  the  correlation  is  larger. 

In  addition  to  these  improvements  in  our  ability  to  per- 
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form  etch  element  of  aur  processing  program,  the 
application  of  methuts  established  he**  to  laboratory  and 
field  data  would  be  quite  productive.  The  applications 
would  be  both  to  classification  by  dynamical  invariants  of 
observed  broadband  signals  and  to  prediction  on  those 
signals.  Further  having  a  cigar  idea  now  of  the  geometric 
setting  in  which  the  signal  processing  takes  place  in  time 
domain,  we  can  begin  exploration  of  these  methods  to 
control  of  nonlinear  systems.44 

Finally,  there  is  the  matter  of  noise,  extrinsic  noise, 
which  contaminates  our  broadband  signal  x(n).  Many 
conventional  methods  for  identifying  signals  in  noise  rely 
on  the  distinct  spectral  characteristics  of  the  two.  That 
tool  is  absent  for  us,  and  we  must  use  alternative  tactics. 
We  do  not  have  a  contribution  to  this  important  issue 
which  we  have  tested  out  in  any  quantitative  way.  A  nat- 
uml  framework  will  be  the  distinct  dynamical  charac¬ 
teristics  of  noise  and  chaotic  motion  embodied  in 
differing  dA  (finite  for  a  chaotic  attractor  and  filling  any 
dimension  for  noise),  invariant  density  p(y)  (structured 
for  chaotic  time  series  and  homogeneous  for  noise),  and 


other  similar  attributes.  We  will  report  on  our  tested  .'1 
ideas  in  this  matter  in  future  articles. 
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which  we  hive  been  working,  a  value  of  if  =s  10  — 15  generally  determines  an  optimal  sampling  1  and  remain  constant  as  d  increases  in  value  to  IS,  2(1,  25  etc.  If  the  position  of  the  minimum 

rate  t,  =  -f  for  which  the  singular  value  spectrum  converges  quite  well.  We  have  found  continues  to  increase,  this  is  a  good  indication  of  being  in  the  incorrect  regime  for  at  least 

that  sampling  rates  within  a  factor  of  two  of  this  estimate  produce  equivalent  results.  one  of  the  parameters. 

In  contrast  to  the  standard  time-delay  techniques  for  determining  the  embedding  space,  Finally,  we  would  like  to  stress  that  the  data  reqiiin-nx-iits  for  the  DDL  are  quite  minimal, 

we  require  a  priori  a  proper  sampling  lime  r,  for  the  analysis.  Although  this  may  at  first  For  the  parameter  regimes  defined  ahoic,  for  example,  the  embedding  dimension  of  llie 


DDL(S)  _  •  DDL(S) 


DL(S) 


DDL(S)  for  Vortex  Dynamics 

* 


P 


10  J 


ft 


ft  ft 


ft  * 


ft  ft 


rnT'pnrrrjrTrrrrrTTi'i^rTy^rrr^iTTTrrr^rTTTn'i'ri^ 


1  2  3  4  '5  6  7  _ 

Embedding  Dimension,  b 


10 


Ficintr.  3 


r 


W  v,  i  J!  u  C  *£  «?U*»  t-  e  J 

,S  o  =  5  a  «  fc  °  ©Sc-®  2  J!  v  E 

2  I  <>  g  S  e  >  6  5  8  •§  >,-=  -S  f  « 

E  S  ».!  11  5  5  S' 

E  J  5-s*!  S  §)g  MS*  S- 

u  '§.x  E  “  ")>'cu^^Soo'5g 

5«5^=j!;r2c2„a—  s  ,  =  w 

i-  !  1 J  |  x-s 

•SiSl'|-52Su§ 

e^J.f.|usj-g23 


I  S 


CD 


•-  i  =  3  g  j$ 

J  ,  f  §Q  • 

e^J-7.sSsJ-gSaS-gJ'-- 

S':  °  -  O  -  S 

u>£>“cE’mU<2c-w*-S 

:i  *  5  «  •=  2  .s  s  S  J2  £  .  =  5 

S  o  C  —  J£»  c  £  c  3  £  ~  c 

SSie'o'jllSse^fc 

=  ^S?  =  c 


t!  c  r:  e  xr< 

5  A  «  o  s  U  i  v 
>»  ^  5  «H  ®  *U  ]?•  *o 

0  -5  3  2  g  2  «  * 

Ju?  «-sf 

?lsiSs-g^3 

S  oD.SO.-^ 

“Jii'—T-OCw  . 

5°Or“."  •’*•" 

1°  =  s  r 

j  r  t «  !  -o  s  s  g 

C-  c  >i  ,hc-  c.  ^ 

X  -=  ■£*-  ■£  "  S  c  2 

,-S  ©  <e  u  c  Q  *5  ^  • 

?  S  S.S  oW  * -g<3£ 

J2  _2  3  r  -  c  t-*o 
c  —  <  7  r  J  t  o  ^  • 
c^cE  .  S  c.  d; 

*  “  r  o  <*  .  —  -  n  S  '/3 

*  £  o  £  a  ~  2-0  . 

r>  C  r.  —  i-  K  S  F“ 

<i  {/? 


3  d  m 
u,  7 

■c  c 

X 


8  § 

-  ®  a  f  » r,  r  :  s 


E&sl  s 


S  8  a 

'  ■=  -o 


_  w 

JU  =  i. 


_  u  o  ^ 

is  £  V  Q.  c. 

a.  o  w  =*  «n 

Q.  w  i;  o  Q 

V*>.  -  B  <-»  «  *J  o  2  , 

X  «  2  .5?  -  „  -  E  M  o 

0x  v)«:  3  TJ  t  w 

*  n  9  „  j=  SccKc 
' o  -  "  3  o-i 

*  <p  ~  ;=  -o  ,2  ,£ 

no  ^  •'J 

C  >»  ,*}  O  *0  £ 

•>  3  »*  «fl  JJ  ■" 

ti  3  C  3  C  ^  u 

fxc^si  ®  a  I  I  8 

I  E  s  2-S  2  q  o  s  1 a 

£  ?  -o  ,S  ■?  Q  §2  gj? 


V  U  c  .  „ 

S  JS  *S  2  —  v 

>,  "C  o  »  *c 

"  £  'o  x  2  O 


gj  5! 


8§'5s£-sIfSSti=|lJ 

x  S  •s'  I  S^o^xlh-g 

■  S;s?«fS=i^' 

,  c  cL/1  -=  v>  -*  r;  a  ‘  “  u^—  —  u  >* 

«=  S-Q-  -  3  "  x,2C-=  =  5 

iHS  i  c  St  5s  t”  S  C£  :  S  o 
ic!.S'-i  „  e-:’=-*.H2  :  s  ‘5 
*S  "q.  >  .p.  -a  —  *>  .2  5  ^  5  U  *£  -a  c 

«j  t)  o  A  ou-  &  ^rCp-r;Ov_  j- 

^  -5  ^  ^  3  O  -g  -  .2  |>o“5jc'§ 


c  o  t>  3  3 

♦*  w  p  — 

U  ..  .E  •“  T5  ■- 


S  i'S’cylg  g  ?  §  5 

^:•0oS,^S>HPUS", 


£  =  Jc  O 
c  •«?  o'0 


—  w  ifl  JJ 

8  2  S  s  - 


VJ2 

C  J3  rt 

O  a.-o 


«;  s  w 

_.  V)  J- 

c  o  t:  Vi 
» »  ■/>  o  JA  *.  O  n  VI  •* 

_  •«  -S  *  2  OJ  e  i-S 

J  J  8.1  2  '5  2  §  §  o  T  2 

**  (,)  ii  <  o  2^  o 

U-OJC  E-»2- 

M  ew  v’  O  O  71  41  w 

?  5  E  :  i  -  S  j  -  ^ 

Ui.u20o04^1 

o-=  E  SOi 

X  °  •  #J 


~  y»  w 

r*11  o  s 


p e . 


rt  O 


S':  5  ”  i  E  5Q-. 
GpSgS'iilucif 
?2  §  -2  8  8.J  -5  a  2  . 


1-S 

c 


'i|. 

c  3 
.2  S 
«  u 

E  £ 


e 

o 

s 

u 

o 

-c 


.£  -S  3 


a 

o 

s 

o 

to 

X3 

Of 

o 

c 

u 


B  ■  M  «A 

-*e  2-^  c  O  ^  «  - 

i„  =  '"  -  -pPcSc 

Sj-I.S'lS  g  _  '» 

wC'coh.SKCi 
i£  .  _u  7:  ^  ■>  c^to 

w  V—O  ^  £  V-  O 
£  c  »-  (j  E  o  ^  J  v  u* 

c  S  Q  c  S  r  <  |  *5  ^ 

.£.£.£<  C  3  ^  a.  »c 
Cl£“*»  -  u  o  x  o 

v  *1  £  U  ,2  •*  t)  ■—  *j 


8  $  §■  g> 


S52S« 

u)  a  bo  5, 


E  I  |  a  2  c  S^J-S1 

V  c  "o  'K.  bo  ^  ><  JJ 

5  4>  ^  o  S  «  .2 

#0  JS  ^ 


8  O*  "O1  *>  2 

rr  V  3  *-  .71 

40 n>  §  §  e| 


&o«-xgj-BST3 
cu-o,5  oc 

«jO  nj  o  «J  03  tii  2 


.2  ~3 


CO 


U  M  o 
Os  c* 

if « 

i  1  I 

H  X  ^ 

=  Q  ~ 

•BS  s 

5  ? 


£  “c 

.s.0  ?c 

</>  '3  — 


C  p  V 

■§  a,  - 
U  *c  o 


«  3  *« 

u  ** 
m  ,5«  u 
^  <0  £ 


«i*  j 

3  Q 

8  *« 

d| 
>T*  *4 

•— *  O 

.  a 

Q 

3 

2i 

7" 

r 

; 2 ;  CO 
01 

'E 

p 

B 

B 

3; 

■0 

s 

x  g 

.  U 

11  -iz 

C 

W 

U  w 

Q 

6 

d 

2  5 

5  « 

ih  jz 

c*  *S 

«J  5 

jQ  O 

£<S 

N 

C 

V 

0 

N*  •-* 

c  r? 

-  S 

0  «> 

0 

M* 

*5 

V 

-3 

n 

aj 

£ 

M 

J 

<d 

S.  ”  c*  o’ 

Q  -2  5 
a,  ^  u  *5 

O  3  ^ 

R  J"  •*  3 

2  c*  "  -o 

S  2  o  05 

2  O  g  •o 

zi  u  i  S 

<  2j>  I 

a>5  ■=  I 

C  “  £•  v> 
*«><«>» 

^  D  W 

<2  £-5  !a 
£  J?  8  a 
C  .5  o  •£ 

k  1/1 

■■=  £  <.  *« 

u  5  «  c 

c  c3  S  .a 

°-=  “  S 

*§«■§■ 
o  ^  c 

1  J  =:  cS  ^ 

I  1  'f  <  | 

=  £  <  S  =. 
>  c  -»  *C 

jf"  "<3? 

r  «  U  ' 

=  o  5<  s  S 

|Si2p8. 

CO  C.  *.  XT' 

•  Q  °  rtj  to 

J?  i'-sp 

•  *—».„  <j 

*7  e«  3 

•rj  Cl  C  'S 

5  2  •-  o  S> 

3  £  c  .8  < 

■  ~D  -  0. 

S  <;  -c- 

§  3‘< 

<  eo  t>  *r- 


2  ‘S  w 

«  .5  Jo 

^  v—  ro 

<  d  g 


S  o  ,-C 
co  <o  e 

tn  ?>  S? 


elf 


<8|S:g  itfsa 

iTfrtSv  EC¬ 
U'*  T*  g  ^  °0 

2c«s<6  i3JS 

to  £ 


C  -o  — * 

2;  3  § 

c  *o  i 


2  "3  d. 
75  ^  §: 
w  c*  7 

~  oo 

%  a  *= 


I  £  £ 
*  , 


C  r>  v» 

,|  i  '§ 
v>  3  ? 
e  w  o 

O  V.  -c 

E  £  * 

5^ 

c  gS 
?'§£ 

Si  I 

I--5 


!  H  £> 

»  5t*  >  u 
•  J3  —  w 

,  .  w  C. 
>  2  C.  <C 

;  .«  <  a 


|o|' 

PS  Jc 

i’ll. 

|JI 

J=  b  .5  ■ 


*TJ  *  CO 
5S  C“>  ’O 

«"S2 

c:^» 

jT”S 

So  O  <7> 
u  tO  «) 
Cl  «-• 

tu 


"3  J5*  w 

e  ^ 

6  S’ 
I  =  -t 

S3  SI-* 

o.—  g 
«  i:  o 
W  §  oi 

G  o  o 
J:  <o 

.2 1  £ 


*5  * 

tt  «  .2 
JS  o  «/> 
:=  ^  >» 
*  «>  «: 
c  »*  c 
C*  u  « 
to  *5 


;  8.1 
u«h 


Co  '-  . 

1  S  § 

*  Q  t) 

OT  a  S  . 

•  .S  •*=  c> 

Q-ils 

•3sJr 

oC  c  Cl 

E  3  oi  7 
°  «  .  g 
2qq° 

n 


5  c  ■=  » 

*‘£g3 

S  |  3  ^ 

go. 

|as  5 

•»U  41  , 


•;  c/i 

»c  5^  -»  - 

J=  «  t 

n  ,  >*  «J 

•  5  2  *3  y 

c  o  "  cn 
««  .1  1 
'f  .?  St 
c.  ;  El 

3  C  u  C 

.  £  c.  £ 

|i-.“ 

n  ^  >  ’> 

^  5  t  J 


.  <c 

O  4>  . 
w  c 
0^:0 

*-•  C  CO 

-  o  c 

c* 

a 

f  31 

sS? 


10Schwartz,  G.,  “Estimating  the  Dimension  of  a  Model”,  The  Anna!.*  of  Statistics  6,  461-464 
(1978);  J.  Rissanen,  “Modeling  by  Shortest  Data  Description",  Automatics  14,  465-471 
(1978).  See  the  comments  by  M.  Slone.  “Comments  on  model  selection  criteria  of  Akaike 
and  Schwartz”,  J.  R.  Statist.  Sac.  R  41.  276-27S  (1979). 


